Question 1

The \((X'X)^{-1}\) for the \(y=β_0+β_1 x_1+β_2 x_2+β_3 x_3+β_4 x_4+β_5 x_5+β_6 x_6+ε\) is given below.

  1. If MSE = 1.395 and n = 38 , compute the

\[se(\mathbf{\hat\beta_4})=\sqrt{MSE\times C_{55}}=\sqrt{1.395\times0.069}=0.3102499\]

\[Cov(\mathbf{\hat\beta_2,\hat\beta_4})=MSE\times C_{35}=1.395\times(-0.035)=-0.048825\]

\[se(\mathbf{\hat\beta_2})=\sqrt{MSE\times C_{33}}=\sqrt{1.395\times0.067}=0.3057205\]

\[Cor(\mathbf{\hat\beta_2,\hat\beta_4})=\frac{Cov(\mathbf{\hat\beta_2,\hat\beta_4})}{se(\mathbf{\hat\beta_2})se(\mathbf{\hat\beta_4})}=\frac{-0.048825}{0.3057205\times0.3102499}=-0.5147615\]
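The arithmetic above can be reproduced in a few lines of R. This is a minimal sketch: MSE and the three entries of \((X'X)^{-1}\) are copied from the calculations above, and the variable names are only illustrative.

```r
# Values taken from the worked calculations above
mse <- 1.395
c33 <- 0.067    # diagonal entry of (X'X)^{-1} for beta_2
c55 <- 0.069    # diagonal entry of (X'X)^{-1} for beta_4
c35 <- -0.035   # off-diagonal entry for the pair (beta_2, beta_4)

se_b2  <- sqrt(mse * c33)            # standard error of beta_2 hat
se_b4  <- sqrt(mse * c55)            # standard error of beta_4 hat
cov_24 <- mse * c35                  # covariance of the two estimators
cor_24 <- cov_24 / (se_b2 * se_b4)   # correlation of the two estimators

# MSE cancels in the ratio, so the correlation depends only on (X'X)^{-1}
cor_check <- c35 / sqrt(c33 * c55)

print(c(se_b2 = se_b2, se_b4 = se_b4, cov_24 = cov_24, cor_24 = cor_24))
```

Because MSE cancels, the sign of an off-diagonal entry of \((X'X)^{-1}\) alone determines the sign of the correlation between the two estimators.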

\(C_{66}=0.058\) is the smallest diagonal element, so \(\hatβ_5\) has the smallest variance and is the most precisely estimated coefficient.

According to \((X'X)^{-1}\), the off-diagonal elements \(C_{13},\ C_{17},\ C_{24},\ C_{25},\ C_{67}\) are positive.

Therefore, the positively correlated pairs of parameters are

\[\hatβ_0\ \&\ \hatβ_2,\quad \hatβ_0\ \&\ \hatβ_6,\quad \hatβ_1\ \&\ \hatβ_3,\quad \hatβ_1\ \&\ \hatβ_4,\quad \hatβ_5\ \&\ \hatβ_6\]

  1. Consider the following hypothesis: \(H_0: β_1=2β_3,β_2=β_3,β_5=0\)

\[ \mathbf{T}=\begin{bmatrix} 0 & 1 & 0 & -2 & 0 & 0 & 0 \\ 0 & 0 & 1 & -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 0 \end{bmatrix}_{3\times7},\quad \mathbf{β}=\begin{bmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \\ \beta_3 \\ \beta_4 \\ \beta_5 \\ \beta_6 \end{bmatrix}_{7\times1},\quad \mathbf{C}=\begin{bmatrix} 0 \\ 0 \\ 0\end{bmatrix}_{3\times1},\quad rank(\mathbf{T})=3 \]

Under this hypothesis, \(y=β_0+2β_3x_1+β_3x_2+β_3x_3+β_4x_4+0x_5+β_6x_6+ε=β_0+β_3(2x_1+x_2+x_3)+β_4x_4+β_6x_6+ε\)

The full model has 6 slope parameters, while the reduced model has 3.

\[F_0=\frac{\frac{SSE_{Reduced}-SSE_{Full}}{dfE_{Reduced}-dfE_{Full}}}{\frac{SSE_{Full}}{dfE_{Full}}}=\frac{\frac{SSE_{Reduced}-SSE_{Full}}{[n-(k+1-r)]-[n-(k+1)]}}{\frac{SSE_{Full}}{n-(k+1)}}=\frac{\frac{SSE_{Reduced}-SSE_{Full}}{[38-(6+1-3)]-[38-(6+1)]}}{\frac{SSE_{Full}}{38-(6+1)}}=\frac{\frac{SSE_{Reduced}-SSE_{Full}}{3}}{\frac{SSE_{Full}}{31}}\]

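In this conceptual form, the numerator degrees of freedom are \(\nu_1=3\) and the denominator degrees of freedom are \(\nu_2=31\).

Simplified to a closed form, \(F_0=\frac{31\,(SSE_{Reduced}-SSE_{Full})}{3\,SSE_{Full}}\), so the constant 31 appears in the numerator and 3 in the denominator.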
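The extra-sum-of-squares F test can be sketched in R by fitting the full and the reduced model and comparing their error sums of squares. The data below are simulated purely for illustration (the coefficients and the seed are assumptions, chosen so that \(H_0\) holds); only \(n=38\), the six predictors, and the three restrictions come from the question.

```r
set.seed(1)
n <- 38
X <- as.data.frame(matrix(rnorm(n * 6), n, 6))
names(X) <- paste0("x", 1:6)
# Simulated response: beta1 = 2*beta3, beta2 = beta3, beta5 = 0 (hypothetical values)
X$y <- 1 + 2 * X$x1 + X$x2 + X$x3 + 0.5 * X$x4 + 0.3 * X$x6 + rnorm(n)

full    <- lm(y ~ x1 + x2 + x3 + x4 + x5 + x6, data = X)
reduced <- lm(y ~ I(2 * x1 + x2 + x3) + x4 + x6, data = X)  # model under H0

sse_full    <- sum(resid(full)^2)
sse_reduced <- sum(resid(reduced)^2)
df_full     <- df.residual(full)     # 38 - (6 + 1)     = 31
df_reduced  <- df.residual(reduced)  # 38 - (6 + 1 - 3) = 34

f0 <- ((sse_reduced - sse_full) / (df_reduced - df_full)) / (sse_full / df_full)
p_value <- pf(f0, df_reduced - df_full, df_full, lower.tail = FALSE)

tab <- anova(reduced, full)  # built-in comparison of the two nested models
print(tab)
```

The manual statistic `f0` and the `F` column returned by `anova()` are the same quantity, with 3 and 31 degrees of freedom.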

\[SSR=\sum_{i=1}^n(\hat y_i-\bar y)^2=\sum_{i=1}^n(\hat y_i^2-2\hat y_i\bar y+\bar y^2)=\sum_{i=1}^n\hat y_i^2-2\bar y\sum_{i=1}^n\hat y_i+\sum_{i=1}^n\bar y^2\]

\[=\sum_{i=1}^n\hat y_i^2-2\bar yn\frac{\sum_{i=1}^n\hat y_i}n+n\bar y^2=\sum_{i=1}^n\hat y_i^2-2\bar yn\bar y+n\bar y^2=\sum_{i=1}^n\hat y_i^2-n\bar y^2\]
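The identity can be checked numerically. The sketch below uses R's built-in `mtcars` data as a stand-in (an assumption; any least-squares fit with an intercept would do), because the step \(\sum\hat y_i/n=\bar y\) relies on the model containing an intercept.

```r
# Stand-in model with an intercept; mtcars ships with base R
fit  <- lm(mpg ~ wt + hp, data = mtcars)
yhat <- fitted(fit)
ybar <- mean(mtcars$mpg)
n    <- nrow(mtcars)

ssr_definition <- sum((yhat - ybar)^2)        # definition of SSR
ssr_shortcut   <- sum(yhat^2) - n * ybar^2    # closed form derived above

print(c(definition = ssr_definition, shortcut = ssr_shortcut))
print(all.equal(mean(yhat), ybar))  # fitted values average to ybar
```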

Question 2

  1. The matrix of scatterplots and correlation


  1. Which predictors are significantly related to (most likely to contribute to the variation in) the response variable.

Based on the scatterplots and correlations, \(x_4\), \(x_1\), \(x_7\), and \(x_2\) have a medium-to-strong positive linear relationship with the response variable (\(Cor_{y,x_4}=0.866, Cor_{y,x_1}=0.781, Cor_{y,x_7}=0.668, Cor_{y,x_2}=0.666\); all correlation coefficients are above 0.6). \(x_5\) has a medium negative linear relationship with the response variable (\(Cor_{y,x_5}=-0.62\)).


  1. Fit the full model.

\[\hat y=292.561-203.144X_1+ 1055.782X_2-49.24X_3+209.762X_4-10.197X_5-24.558X_6+142.778X_7+511.713X_8-301.872X_9\]


  1. Significant

The overall fitted model is statistically significant at the 5% significance level (\(\text{p-value}=9.744\times10^{-6}\)).

However, most of the individual coefficients are not significant, so this is not the best-fitting model.


  1. The violation of random errors
  • Residual diagnostics: There are some violations. The model does not satisfy the OLS assumptions on the random errors.

On the residual plot, there is a funnel pattern.

On the outlier and leverage plot, there are two outliers.

On the Q-Q plot, most points approximately follow a straight line, with some positive skew.

  • Suggestion: Transform the response and run more diagnostics.

I suggest taking the natural log of the response as a variance-stabilizing transformation.

Diagnostics for heteroskedasticity, variable selection, and measures of influence should also be considered.


  1. The partial sum of squares explained by rainfall.

According to the partial F test, the partial sum of squares explained by rainfall (X8) is 2209825, given that all the other regressors are in the model.
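As a cross-check, the partial sum of squares for a single regressor equals its partial F value times the full-model MSE. The sketch below uses the figures printed in the Type II ANOVA output of the Code and output section; the variable names are illustrative, and the small discrepancy comes from the rounded F value.

```r
# Figures copied from the Type II ANOVA table of the full model
sse_full <- 7425127   # residual sum of squares
df_error <- 20        # residual degrees of freedom
f_x8     <- 5.9523    # partial F value for X8 (rainfall), rounded

mse_full <- sse_full / df_error
ss_x8    <- f_x8 * mse_full   # partial SS = F * MSE for a 1-df term

print(c(mse_full = mse_full, ss_x8 = ss_x8))
```

With the fitted models at hand, the same quantity is the drop in SSE when X8 is added last, that is, SSE of the model without X8 minus SSE of the full model.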


  1. Multicollinearity

According to the variance inflation factors (VIF), the model has a serious multicollinearity problem: the VIFs of X4 (105.754708), X1 (101.859709), X3 (31.446394), and X7 (20.53505) are larger than 10.


  1. Interpret the estimated coefficient of rainfall predictor

The coefficient of 511.713 in the full model suggests that the peak rate of flow increases by 511.713 cubic feet per second when rainfall increases by 1 inch and the other variables are held constant.


  1. Log of response

\[\ln(\hat y)=3.402256-0.013532X_1-1.023664X_2+0.177966X_3+0.108788X_4\] \[-0.009622X_5-0.389474X_6+4.233475X_7+0.63007X_8-0.462276X_9\]

  1. Significant

The overall fitted model is statistically significant at the 5% significance level (\(\text{p-value}=7.513\times10^{-11}\)).

However, most of the individual coefficients are not significant, so this is not the best-fitting model.


  1. Multicollinearity.

The model still has a serious multicollinearity problem. The log transformation of the response does not change the VIFs, because they depend only on the predictors: X4 (105.754708), X1 (101.859709), X3 (31.446394), X7 (20.53505).
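The reason is visible in the definition \(VIF_j=1/(1-R_j^2)\), where \(R_j^2\) comes from regressing predictor \(j\) on the other predictors; the response never enters. A minimal sketch follows, using the built-in `mtcars` data as a stand-in (an assumption) and a hypothetical helper `vif_manual()` written in base R.

```r
# Hypothetical helper: VIF of each predictor from the design matrix only
vif_manual <- function(model) {
  X <- model.matrix(model)[, -1, drop = FALSE]  # drop the intercept column
  vapply(colnames(X), function(v) {
    others <- X[, colnames(X) != v, drop = FALSE]
    r2 <- summary(lm(X[, v] ~ others))$r.squared  # R^2 of x_j on the rest
    1 / (1 - r2)
  }, numeric(1))
}

# Same predictors, raw versus log-transformed response
fit_raw <- lm(mpg ~ wt + hp + disp, data = mtcars)
fit_log <- lm(log(mpg) ~ wt + hp + disp, data = mtcars)

print(vif_manual(fit_raw))
print(all.equal(vif_manual(fit_raw), vif_manual(fit_log)))
```

Transforming a predictor, by contrast, changes the design matrix and therefore can change the VIFs.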


  1. If you wanted to simplify this full model, explain which predictor you would eliminate first.
  • VIF

Considering the VIF alone, X4 (105.754708) or X1 (101.859709), which have the largest VIF values, would be the first to remove.

  • Correlation with y

However, according to the correlation coefficients, X4, X1, and X7 are strongly correlated with y (\(Cor_{y,x_4}=0.866, Cor_{y,x_1}=0.781, Cor_{y,x_7}=0.668\)). The textbook suggests that the general approaches for dealing with multicollinearity include collecting additional data, model respecification (redefining the regressors, variable elimination), and alternative estimation methods (ridge regression, principal-component regression). “Variable elimination is often a highly effective technique. However, it may not provide a satisfactory solution if the regressors dropped from the model have significant explanatory power relative to the response y. That is, eliminating regressors to reduce multicollinearity may damage the predictive power of the model.” (Montgomery et al., 2012, p. 304) On this basis, X3, which has the third-largest VIF (31.446394) and only a weak relationship with y (0.205), should be considered.

  • Correlation with each other

Judging by their descriptions, X4, X1, and X3 are geographic variables: X1 is the area of the watershed, X4 is the longest stream flow in the watershed, and X3 is the average slope of the watershed. For the six watersheds given, X1 and X4 are strongly related, so a high correlation (0.921) between them is expected. But X3 is not significantly related to X1 (-0.078) or X4 (0.245), so removing X3 might lose some irreplaceable information.

  • Elimination test

Actually, I do not think any predictor should be removed at this stage. Removing a predictor can lower the VIF substantially, and after stepwise elimination the multicollinearity disappears in all the models. We should run more diagnostics and comparisons, and gather sufficient evidence, before removing any predictor.

| Removed | Max VIF | R-squared | Model after elimination (/ = removed) | New max VIF | New R-squared |
|---|---|---|---|---|---|
| none | 106 (X4) | 0.947 | /, /, X3, X4, /, X6, X7, X8, X9 | X7 = 6.28 | 0.947 |
| X1 | 8.70 (X4) | 0.947 | /, /, X3, X4, /, X6, X7, X8, X9 | X7 = 6.28 | 0.947 |
| X2 | 67.4 (X4) | 0.947 | /, /, X3, X4, /, X6, X7, X8, X9 | X7 = 6.28 | 0.947 |
| X3 | 8.87 (X4) | 0.941 | X1, X2, /, X4, X5, /, /, X8, X9 | X1 = 8.39 | 0.937 |
| X4 | 8.38 (X1) | 0.945 | X1, /, X3, /, /, X6, X7, X8, X9 | X9 = 5.10 | 0.943 |
| X5 | 58.4 (X4) | 0.947 | /, /, X3, X4, /, X6, X7, X8, X9 | X7 = 6.28 | 0.947 |
| X6 | 54.8 (X4) | 0.939 | X1, X2, /, X4, X5, /, /, X8, X9 | X1 = 8.39 | 0.937 |
| X7 | 42.0 (X1) | 0.939 | X1, X2, /, X4, X5, /, /, X8, X9 | X1 = 8.39 | 0.937 |
| X8 | 103 (X4) | 0.900 | X1, /, X3, /, /, X6, X7, /, / | X7 = 3.96 | 0.893 |
| X9 | 100 (X4) | 0.910 | X1, /, X3, /, /, X6, X7, X8, / | X7 = 3.97 | 0.906 |

Before elimination, removing X8 or X9 hurts the R-squared most, while removing X1, X2, or X5 affects it least. After elimination, we get three kinds of model: ‘346789’ (\(R^2=0.947\)); ‘124589’ (\(R^2=0.937\)); ‘1367(8)(9)’ (\(R^2\) from 0.943 down to 0.906). It is clear that X1 contains some useful information, while X2 and X5 help least.

Turning back to the correlation coefficients, X2 and X5 have a medium correlation with y (\(Cor_{y,x_2}=0.666, Cor_{y,x_5}=-0.62\)), so they are not much different. Looking at the context, X5 (surface absorbency index) and X2 (area impervious to water) have similar effects on initial flow (see the discussion at the end).

Finally, among the four surface-property variables related to initial flow (X2, X5, X6, X7), X2 and X7 have the highest correlation, which means X7 may carry most of the information contained in X2. Therefore, if we have to remove one variable, X2 is the best option.

| | X1 | X2 | X3 | X4 | X5 | X6 | X7 | X8 | X9 | y |
|---|---|---|---|---|---|---|---|---|---|---|
| X2 | 0.80125933 | 1.00000000 | -0.07302375 | 0.76056089 | -0.48607440 | 0.06598115 | 0.8324480 | 0.10540203 | 0.13331333 | 0.66566120 |
| X5 | -0.73673561 | -0.48607440 | -0.40289059 | -0.77701188 | 1.00000000 | -0.27043338 | -0.4814787 | -0.04005129 | -0.10502725 | -0.62017585 |

  1. The forward selection method (use α=0.15)

Use Stepwise Forward Regression based on p values (use α=0.15)

\[\ln(\hat y)=2.872+0.168X_3+0.122X_4+3.106X_7\]

Use Stepwise AIC Forward Regression

\[\ln(\hat y)=2.692+0.184X_3+0.109X_4-0.368X_6+4.085X_7+0.612X_8-0.448X_9\]


  1. The backward elimination method (use α=0.05)

Stepwise Backward Regression based on p-values (α=0.05) and Stepwise AIC Backward Regression give the same result.

\[\ln(\hat y)=2.692+0.184X_3+0.109X_4-0.368X_6+4.085X_7+0.612X_8-0.448X_9\]
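AIC-based backward elimination can be sketched with base R's `step()`. This is only an illustration under stated assumptions: it uses the built-in `mtcars` data as a stand-in for the watershed data, and `step()` selects by AIC only, so it does not reproduce the p-value-based variant.

```r
# Stand-in full model on a log response
full <- lm(log(mpg) ~ wt + hp + disp + drat + qsec, data = mtcars)

# Drop one term at a time while doing so lowers the AIC
selected <- step(full, direction = "backward", trace = 0)

print(formula(selected))
print(c(full     = extractAIC(full)[2],
        selected = extractAIC(selected)[2]))
```

Setting `trace = 1` prints one table per step, in the same layout as the elimination traces shown in the Code and output section.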


  1. Best subsets method

The best subsets method gives the same model.

\[\ln(\hat y)=2.692+0.184X_3+0.109X_4-0.368X_6+4.085X_7+0.612X_8-0.448X_9\]


  1. Compare and suggest one best model

| Method | By | Keep | Remove |
|---|---|---|---|
| Stepwise Forward | P=0.15 | X3, X4, X7 | X1, X2, X5, X6, X8, X9 |
| Stepwise Forward | AIC | X3, X4, X6, X7, X8, X9 | X1, X2, X5 |
| Stepwise Backward | P=0.05 | X3, X4, X6, X7, X8, X9 | X1, X2, X5 |
| Stepwise Backward | AIC | X3, X4, X6, X7, X8, X9 | X1, X2, X5 |
| Stepwise Both | P | X3, X4, X7 | X1, X2, X5, X6, X8, X9 |
| Stepwise Both | AIC | X3, X4, X6, X7, X8, X9 | X1, X2, X5 |
| Best Subset | p | X3, X4, X6, X7, X8, X9 | X1, X2, X5 |
| Best Subset | AIC | X3, X4, X6, X7, X8, X9 | X1, X2, X5 |
| All possible | / | X3, X4, X6, X7, X8, X9 | X1, X2, X5 |

Both models resolve the multicollinearity problem (VIF < 10) and have small p-values for the F test. Neither shows a serious violation of the assumptions about the errors: there is no significant pattern on the plot of studentized residuals versus predicted values, the partial regression plots do not show nonlinear patterns, and the points approximately follow a straight line on the Q-Q plot. The correlations between observed residuals and expected residuals under normality are 0.9837263 and 0.9856766 for the two models.

| Model | VIF | F | P-value (F) | MSR | MSE | \(R_{adjusted}^2\) | \(R_{Predict}^2\) | P-value (t) | Residual plots |
|---|---|---|---|---|---|---|---|---|---|
| 3-4-7 | <10 | 70.378 | 0.0000 | 21.188 | 0.301 | 0.878 | 0.854 | Max=0.054 | Good enough |
| 3-4-6-7-8-9 | <10 | 68.16 | 0.0000 | 11.265 | 0.165 | 0.933 | 0.908 | Max=0.019 | Good enough |

However, compared with the 3-variable model, the 6-variable model has a higher adjusted R-squared (by about 6%) and a higher prediction R-squared (by about 5%), which indicates stronger predictive capability. All the coefficients in the 6-predictor model are statistically significant at the 2% level (the maximum p-value is 0.019). In the 3-variable model, X7 has a p-value of 0.054, which is not significant at the 5% significance level. If we vary the \(\alpha\) used in forward selection, the same model is obtained for \(\alpha\) between 0.6 and 0.17. Further, considering the context, X8 and X9 are precipitation variables. The 3-predictor model would imply that peak flow is unrelated to precipitation, which does not make sense. Therefore, the best model is the one with 6 predictors.


  1. Provide complete ANOVA table for the best model. Provide partial sum of squares, estimated coefficients, standard errors, p-values, 95% Bonferroni joint confidence intervals for the coefficients of the best model. Provide in a tabular form clearly.

Model Summary

| Statistic | Value | Statistic | Value |
|---|---|---|---|
| R | 0.973 | RMSE (Root Mean Square Error) | 0.407 |
| R-Squared | 0.947 | Coef. Var | 6.385 |
| Adj. R-Squared | 0.933 | MSE (Mean Square Error) | 0.165 |
| Pred R-Squared | 0.908 | MAE (Mean Absolute Error) | 0.273 |

ANOVA

| Source | Sum of Squares | DF | Mean Square | F | p-value |
|---|---|---|---|---|---|
| Regression | 67.591 | 6 | 11.265 | 68.16 | \(1.717\times10^{-13}\) |
| Residual | 3.801 | 23 | 0.165 | | |
| Total | 71.393 | 29 | | | |

Parameter Estimates

| Term | Estimated coefficient | Partial SS | Std. Error | t value | p-value | Lower bound (0.357 %) | Upper bound (99.643 %) |
|---|---|---|---|---|---|---|---|
| (Intercept) | 2.69180 | / | 0.445 | 6.046 | \(3.63\times10^{-06}\) | 1.37732232 | 4.00627202 |
| X3 | 0.18384 | 5.37 | 0.032 | 5.698 | \(8.41\times10^{-06}\) | 0.08857700 | 0.27911109 |
| X4 | 0.10905 | 2.98 | 0.026 | 4.244 | 0.000306 | 0.03318876 | 0.18491189 |
| X6 | -0.36752 | 1.05 | 0.146 | -2.526 | 0.018898 | -0.79716475 | 0.06213151 |
| X7 | 4.08497 | 1.87 | 1.213 | 3.367 | 0.002662 | 0.50312634 | 7.66681371 |
| X8 | 0.61161 | 3.52 | 0.133 | 4.614 | 0.000122 | 0.22022202 | 1.00298907 |
| X9 | -0.44764 | 2.83 | 0.108 | -4.135 | 0.000402 | -0.76727751 | -0.12799849 |
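The interval bounds at 0.357 % and 99.643 % follow from splitting \(\alpha=0.05\) over the 7 coefficients and two tails. The sketch below works from the reported figures only; recovering the standard error of X3 from its estimate and t value is an assumption made because the table prints the standard error rounded to three decimals.

```r
alpha <- 0.05
p     <- 7     # intercept plus 6 slopes, one interval each
df    <- 23    # residual degrees of freedom of the best model

tail_prob <- alpha / (2 * p)        # probability left in each tail
t_crit    <- qt(1 - tail_prob, df)  # Bonferroni critical value

# Example for X3, using the reported estimate and t value
est_x3 <- 0.18384
se_x3  <- est_x3 / 5.698            # standard error implied by the t value
ci_x3  <- est_x3 + c(-1, 1) * t_crit * se_x3

print(c(lower_pct = 100 * tail_prob, upper_pct = 100 * (1 - tail_prob)))
print(ci_x3)
```

With the fitted `lm` object available, `confint(model, level = 1 - 0.05 / 7)` gives all seven joint intervals in one call.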

  1. How much variation in the response is explained by the best model after taking number of data and regression coefficients in to account?

With SSR = 67.591 and SSE = 3.801, the adjusted R-squared is 0.9329. About 93.29% of the variation in the response is explained by the best model.
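The adjusted R-squared penalizes SSE and SST by their degrees of freedom. A short sketch of the arithmetic, using the sums of squares and degrees of freedom from the ANOVA table (n = 30 follows from the total of 29 degrees of freedom):

```r
ssr <- 67.591   # regression sum of squares
sse <- 3.801    # residual sum of squares
n   <- 30       # observations (total DF 29 + 1)
k   <- 6        # predictors in the best model

sst    <- ssr + sse
r2     <- 1 - sse / sst
adj_r2 <- 1 - (sse / (n - k - 1)) / (sst / (n - 1))

print(round(c(r2 = r2, adj_r2 = adj_r2), 4))
```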


  1. Report the PRESS statistic of the best model.

The PRESS statistic is 6.538275. The corresponding prediction R-squared is 0.908, so the model explains about 90.8% of the variation when predicting the peak rate of flow (in cfs) of water from six watersheds following storm episodes.
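PRESS is the sum of squared leave-one-out prediction errors, which for a linear model can be computed from the ordinary residuals and the hat values without refitting. The sketch below shows a hypothetical helper `press()` on the built-in `mtcars` data (a stand-in, since the watershed file is not bundled here), followed by the prediction R-squared implied by the reported PRESS and total sum of squares.

```r
# Hypothetical helper: PRESS from residuals e_i and leverages h_ii
press <- function(model) {
  e <- resid(model)
  h <- hatvalues(model)
  sum((e / (1 - h))^2)
}

# Stand-in model on a log response
fit       <- lm(log(mpg) ~ wt + hp, data = mtcars)
press_fit <- press(fit)
sst_fit   <- sum((log(mtcars$mpg) - mean(log(mtcars$mpg)))^2)
pred_r2_fit <- 1 - press_fit / sst_fit

# Prediction R-squared from the figures reported for the best model
pred_r2_reported <- 1 - 6.538275 / 71.393

print(c(pred_r2_fit = pred_r2_fit, pred_r2_reported = pred_r2_reported))
```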


Discussion

  • Comparison VIF before and after elimination regression for each variable.

  • Independence and grouping

Linear regression is one type of watershed hydrological model. Singh (1972) used linear models with a logarithm transformation of the variables. We retained the following

\[\ln(Dependent\ Variable)=β_0+β_1\ln P+β_2\ln Q_i+β_3\ln F_p+β_4CC+Interactions+\varepsilon\]

where the dependent variables can either be total storm flow volume (\(Q_t\)) in mm, quick flow volume (\(Q_f\)) in mm or peak flow (\(Q_{pk}\)) in \(m^3 sec^{−1} km^{−2}\). Independent variables were storm rainfall (\(P\)) in \(mm\), initial flow (\(Q_i\)) in \(mm h^{−1}\), rainfall frequency (\(F_p\)), the inverse of rainfall duration, in \(h^{−1}\) and a dummy variable (\(CC\)) representing the treatment effect on basin. \(CC\) was 0 and 1 for the calibration (1967–1992) and treated (1994–1998) periods, respectively. \(β_0\) to \(β_4\) are regression coefficients of the independent variables. All interactions between the independent variables were also tested for significance at \(α=0.10\). (Guillemette et al., 2005)

Inspired by this theory, I divided the variables into 3 groups:

| P: Precipitation | F: Time | Q: Terrain | Q: Surface |
|---|---|---|---|
| x8: Rainfall (inches) | x9: Time period during which rainfall exceeded ¼ inch/hour | x1: Area of watershed (\(mi^2\)) | x2: Area impervious to water (\(mi^2\)) |
| | | x3: Average slope of watershed (percent) | x5: Surface absorbency index |
| | | x4: Longest stream flow in watershed (1000s of feet) | x6: Estimated soil storage capacity (inches of water) |
| | | | x7: Infiltration rate of water into soil (inches/hour) |

It is reasonable that the best model should contain at least one variable from each group. In this way, X8 and X9 are indispensable variables. We also see that the stepwise forward method can neglect this context and is not recommended for this question.

  • Log and interaction

| Transform | Interactions | Elimination | Model | Number of predictors | Number of variables | R-squared |
|---|---|---|---|---|---|---|
| log(all) | no | stepboth p | log(X4) | 1 | 1 | 0.910 |
| log(y) | no | Backward AIC | X3, X4, X6, X7, X8, X9 | 6 | 6 | 0.947 |
| log(all) | no | stepboth AIC | log(X1), log(X3), log(X5), log(X6), log(X8), log(X9) | 6 | 6 | 0.983 |
| Mixed | yes | Backward AIC | X1:X3, X1:X4, log(X8), log(X9) | 7 | 5 | 0.984 |
| log(all) | no | Backward AIC | log(X1), log(X3), log(X5), log(X6), log(X8), log(X9) | 6 | 6 | 0.986 |
| log(y) | yes | Backward AIC | omitted | 15 | 9 | 0.994 |
| log(all) | yes | Backward AIC | omitted | 22 | 9 | 0.998 |

It is obvious that the all-log model with interactions is overfitting. Therefore, we should control the number of predictors. Since stepwise elimination yields 6 predictors, a new model with a higher R-squared and fewer than 7 predictors is better than the solution in question (7). The best model should be

\[\ln(\hat y)=0.57120+0.72550\ln(X_1)+0.41866\ln(X_3)+1.25873\ln(X_5)-0.26702\ln(X_6)+1.62253\ln(X_8)-1.37489\ln(X_9)\]
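Exponentiating both sides rewrites this log-log model as a multiplicative power law, which makes the coefficients easier to read. This is only an algebraic restatement of the fitted equation; it ignores the retransformation bias that arises when predicting \(y\) itself from a model fitted on \(\ln y\).

\[\hat y\approx e^{0.57120}\,X_1^{0.72550}\,X_3^{0.41866}\,X_5^{1.25873}\,X_6^{-0.26702}\,X_8^{1.62253}\,X_9^{-1.37489}\]

Each exponent acts as an elasticity. For example, with the other variables held constant, a 1% increase in \(X_8\) multiplies the predicted peak flow by \(1.01^{1.62253}\approx1.016\), an increase of about 1.6%.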

Here I omitted the results of the other combinations.

[1]: Montgomery, D. C., Peck, E. A., & Vining, G. G. (2012). Introduction to linear regression analysis (Vol. 821). John Wiley & Sons.

[2]: Guillemette, F., Plamondon, A. P., Prévost, M., & Lévesque, D. (2005). Rainfall generated stormflow response to clearcutting a boreal forest: peak flow comparison with 50 world-wide basin studies. Journal of hydrology, 302(1-4), 137-153.

Code and output

(a) The matrix of scatterplots and the correlation matrix

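library(tidyverse)
library(GGally)
# read the whitespace-delimited data file (read_table() replaces the deprecated read_table2() in readr >= 2.0)
table_wf <- read_table("WaterFlow.txt")
# scatterplot matrix with pairwise correlations for the first 10 columns
ggpairs(data = table_wf[c(1:10)])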

(c) The fitted full model

# build the model
model_wf_full <- lm(y ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9, data=table_wf)
model_wf_full%>% summary()
## 
## Call:
## lm(formula = y ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9, 
##     data = table_wf)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1404.21  -318.77    74.73   266.66  1274.30 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)  
## (Intercept)   292.56    4428.62   0.066   0.9480  
## X1           -203.14     410.27  -0.495   0.6259  
## X2           1055.78    9833.70   0.107   0.9156  
## X3            -49.24     156.20  -0.315   0.7558  
## X4            209.76     162.05   1.294   0.2103  
## X5            -10.20      51.09  -0.200   0.8438  
## X6            -24.56     303.53  -0.081   0.9363  
## X7            142.78    3288.44   0.043   0.9658  
## X8            511.71     209.74   2.440   0.0241 *
## X9           -301.87     172.00  -1.755   0.0945 .
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 609.3 on 20 degrees of freedom
## Multiple R-squared:  0.8214, Adjusted R-squared:  0.741 
## F-statistic: 10.22 on 9 and 20 DF,  p-value: 9.744e-06
library(car)  # provides Anova()
Anova(model_wf_full)  # Type II sums of squares
## Anova Table (Type II tests)
## 
## Response: y
##            Sum Sq Df F value  Pr(>F)  
## X1          91022  1  0.2452 0.62589  
## X2           4279  1  0.0115 0.91557  
## X3          36893  1  0.0994 0.75585  
## X4         622091  1  1.6756 0.21025  
## X5          14790  1  0.0398 0.84381  
## X6           2430  1  0.0065 0.93632  
## X7            700  1  0.0019 0.96580  
## X8        2209825  1  5.9523 0.02414 *
## X9        1143622  1  3.0804 0.09455 .
## Residuals 7425127 20                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(c) ii Residual diagnostics

# Model fit assessment (the ols_* functions come from the olsrr package)
library(olsrr)
ols_plot_diagnostics(model_wf_full)

# Correlation between observed residuals and expected residuals under normality
ols_test_correlation(model_wf_full)
## [1] 0.9710713
# Residual normality tests: a large p-value gives no evidence against normality
ols_test_normality(model_wf_full)
## -----------------------------------------------
##        Test             Statistic       pvalue  
## -----------------------------------------------
## Shapiro-Wilk              0.9589         0.2898 
## Kolmogorov-Smirnov        0.1423         0.5314 
## Cramer-von Mises          2.5333         0.0000 
## Anderson-Darling          0.5169         0.1748 
## -----------------------------------------------

(c) iii The partial regression and nonlinear diagnostics

#Lack of Fit F Test
ols_pure_error_anova(lm(y~X8, data = table_wf))
## Lack of Fit F Test 
## ---------------
## Response :   y 
## Predictor:   X8 
## 
##                        Analysis of Variance Table                         
## -------------------------------------------------------------------------
##                 DF      Sum Sq        Mean Sq      F Value       Pr(>F)   
## -------------------------------------------------------------------------
## X8               1     4616882.92    4616882.92    5.795558    0.02290414 
## Residual        28    36951252.44    1319687.59                           
##  Lack of fit    21    31374881.28    1494041.97    1.875466     0.2003839 
##  Pure Error      7     5576371.17     796624.45                           
## -------------------------------------------------------------------------
# Variable Contributions
ols_plot_added_variable(model_wf_full)

# Residual Plus Component Plot
ols_plot_comp_plus_resid(model_wf_full)

(c) iv Collinearity diagnostics

# for full model
ols_vif_tol(model_wf_full)
## # A tibble: 9 x 3
##   Variables Tolerance    VIF
##   <chr>         <dbl>  <dbl>
## 1 X1          0.00982 102.  
## 2 X2          0.133     7.52
## 3 X3          0.0318   31.4 
## 4 X4          0.00946 106.  
## 5 X5          0.103     9.68
## 6 X6          0.433     2.31
## 7 X7          0.0487   20.5 
## 8 X8          0.182     5.50
## 9 X9          0.174     5.75

(d) The fitted log model

# build full log model

# table_wf_logy <- table_wf %>% mutate(logy=log(y))
# table_wf_logy$y <- NULL
# library(GGally)
# ggpairs(data=table_wf_logy[c(1:10)])

model_wf_full_log <- lm(log(y) ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9, data=table_wf)
summary(model_wf_full_log)
## 
## Call:
## lm(formula = log(y) ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + 
##     X9, data = table_wf)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.95298 -0.20764  0.01499  0.18100  0.67539 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  3.402256   3.150312   1.080 0.293006    
## X1          -0.013532   0.291845  -0.046 0.963477    
## X2          -1.023664   6.995235  -0.146 0.885120    
## X3           0.177966   0.111113   1.602 0.124908    
## X4           0.108788   0.115272   0.944 0.356560    
## X5          -0.009622   0.036341  -0.265 0.793898    
## X6          -0.389474   0.215916  -1.804 0.086345 .  
## X7           4.233475   2.339245   1.810 0.085387 .  
## X8           0.630070   0.149200   4.223 0.000418 ***
## X9          -0.462276   0.122350  -3.778 0.001181 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.4334 on 20 degrees of freedom
## Multiple R-squared:  0.9474, Adjusted R-squared:  0.9237 
## F-statistic:    40 on 9 and 20 DF,  p-value: 7.513e-11
Anova(model_wf_full_log)  # Type II tests for the log model (Anova() is from the car package)
#Model Fit Assessment
# ols_plot_diagnostics(model_wf_full_log)

# Part & Partial Correlations
# ols_test_correlation(model_wf_full_log) # Correlation between observed residuals and expected residuals under normality.

# Residual Normality Test
# ols_test_normality(model_wf_full_log) # Test for detecting violation of normality assumption. #If p-value is bigger, then no problem of non-normality #
library(dplyr)

(d) (2) Collinearity diagnostics

## Start:  AIC=-42.32
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9
## 
##        Df Sum of Sq    RSS     AIC
## - X1    1    0.0004 3.7577 -44.322
## - X2    1    0.0040 3.7613 -44.293
## - X5    1    0.0132 3.7705 -44.220
## - X4    1    0.1673 3.9246 -43.018
## <none>              3.7573 -42.325
## - X3    1    0.4819 4.2392 -40.705
## - X6    1    0.6113 4.3686 -39.803
## - X7    1    0.6153 4.3726 -39.775
## - X9    1    2.6819 6.4392 -28.164
## - X8    1    3.3503 7.1076 -25.201
## 
## Step:  AIC=-44.32
## log(y) ~ X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9
## 
##        Df Sum of Sq    RSS     AIC
## - X2    1    0.0110 3.7686 -46.234
## - X5    1    0.0267 3.7844 -46.110
## <none>              3.7577 -44.322
## - X6    1    1.0447 4.8023 -38.963
## - X7    1    1.5520 5.3097 -35.950
## - X4    1    1.8469 5.6046 -34.328
## - X9    1    2.8341 6.5918 -29.461
## - X8    1    3.4848 7.2425 -26.637
## - X3    1    5.0955 8.8532 -20.613
## 
## Step:  AIC=-46.23
## log(y) ~ X3 + X4 + X5 + X6 + X7 + X8 + X9
## 
##        Df Sum of Sq    RSS     AIC
## - X5    1    0.0327 3.8013 -47.975
## <none>              3.7686 -46.234
## - X6    1    1.0375 4.8061 -40.939
## - X4    1    1.8741 5.6428 -36.125
## - X7    1    1.9036 5.6722 -35.968
## - X9    1    2.8353 6.6040 -31.406
## - X8    1    3.4744 7.2430 -28.635
## - X3    1    5.1264 8.8951 -22.471
## 
## Step:  AIC=-47.98
## log(y) ~ X3 + X4 + X6 + X7 + X8 + X9
## 
##        Df Sum of Sq    RSS     AIC
## <none>              3.8013 -47.975
## - X6    1    1.0542 4.8555 -42.632
## - X7    1    1.8739 5.6752 -37.953
## - X9    1    2.8256 6.6270 -33.302
## - X4    1    2.9771 6.7784 -32.624
## - X8    1    3.5182 7.3195 -30.320
## - X3    1    5.3653 9.1666 -23.569
## Start:  AIC=-44.32
## log(y) ~ X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9
## 
##        Df Sum of Sq    RSS     AIC
## - X2    1    0.0110 3.7686 -46.234
## - X5    1    0.0267 3.7844 -46.110
## <none>              3.7577 -44.322
## - X6    1    1.0447 4.8023 -38.963
## - X7    1    1.5520 5.3097 -35.950
## - X4    1    1.8469 5.6046 -34.328
## - X9    1    2.8341 6.5918 -29.461
## - X8    1    3.4848 7.2425 -26.637
## - X3    1    5.0955 8.8532 -20.613
## 
## Step:  AIC=-46.23
## log(y) ~ X3 + X4 + X5 + X6 + X7 + X8 + X9
## 
##        Df Sum of Sq    RSS     AIC
## - X5    1    0.0327 3.8013 -47.975
## <none>              3.7686 -46.234
## - X6    1    1.0375 4.8061 -40.939
## - X4    1    1.8741 5.6428 -36.125
## - X7    1    1.9036 5.6722 -35.968
## - X9    1    2.8353 6.6040 -31.406
## - X8    1    3.4744 7.2430 -28.635
## - X3    1    5.1264 8.8951 -22.471
## 
## Step:  AIC=-47.98
## log(y) ~ X3 + X4 + X6 + X7 + X8 + X9
## 
##        Df Sum of Sq    RSS     AIC
## <none>              3.8013 -47.975
## - X6    1    1.0542 4.8555 -42.632
## - X7    1    1.8739 5.6752 -37.953
## - X9    1    2.8256 6.6270 -33.302
## - X4    1    2.9771 6.7784 -32.624
## - X8    1    3.5182 7.3195 -30.320
## - X3    1    5.3653 9.1666 -23.569
## Start:  AIC=-44.29
## log(y) ~ X1 + X3 + X4 + X5 + X6 + X7 + X8 + X9
## 
##        Df Sum of Sq    RSS     AIC
## - X1    1    0.0073 3.7686 -46.234
## - X5    1    0.0370 3.7983 -45.999
## <none>              3.7613 -44.293
## - X4    1    0.3141 4.0754 -43.887
## - X6    1    0.7115 4.4729 -41.095
## - X3    1    0.7775 4.5388 -40.656
## - X7    1    1.2667 5.0280 -37.585
## - X9    1    2.7122 6.4735 -30.005
## - X8    1    3.4001 7.1614 -26.975
## 
## Step:  AIC=-46.23
## log(y) ~ X3 + X4 + X5 + X6 + X7 + X8 + X9
## 
##        Df Sum of Sq    RSS     AIC
## - X5    1    0.0327 3.8013 -47.975
## <none>              3.7686 -46.234
## - X6    1    1.0375 4.8061 -40.939
## - X4    1    1.8741 5.6428 -36.125
## - X7    1    1.9036 5.6722 -35.968
## - X9    1    2.8353 6.6040 -31.406
## - X8    1    3.4744 7.2430 -28.635
## - X3    1    5.1264 8.8951 -22.471
## 
## Step:  AIC=-47.98
## log(y) ~ X3 + X4 + X6 + X7 + X8 + X9
## 
##        Df Sum of Sq    RSS     AIC
## <none>              3.8013 -47.975
## - X6    1    1.0542 4.8555 -42.632
## - X7    1    1.8739 5.6752 -37.953
## - X9    1    2.8256 6.6270 -33.302
## - X4    1    2.9771 6.7784 -32.624
## - X8    1    3.5182 7.3195 -30.320
## - X3    1    5.3653 9.1666 -23.569
## Start:  AIC=-40.7
## log(y) ~ X1 + X2 + X4 + X5 + X6 + X7 + X8 + X9
## 
##        Df Sum of Sq     RSS     AIC
## - X7    1    0.1337  4.3729 -41.773
## - X6    1    0.1858  4.4250 -41.418
## <none>               4.2392 -40.705
## - X2    1    0.2995  4.5388 -40.656
## - X5    1    1.1498  5.3891 -35.505
## - X9    1    3.5480  7.7872 -24.461
## - X8    1    4.0900  8.3292 -22.443
## - X1    1    4.6140  8.8532 -20.613
## - X4    1   13.7481 17.9873   0.654
## 
## Step:  AIC=-41.77
## log(y) ~ X1 + X2 + X4 + X5 + X6 + X8 + X9
## 
##        Df Sum of Sq     RSS     AIC
## - X6    1    0.1369  4.5098 -42.849
## <none>               4.3729 -41.773
## - X2    1    0.6577  5.0306 -39.570
## - X5    1    1.0236  5.3965 -37.464
## - X9    1    3.4161  7.7890 -26.455
## - X8    1    3.9564  8.3293 -24.442
## - X1    1    4.7933  9.1662 -21.570
## - X4    1   13.8200 18.1929  -1.005
## 
## Step:  AIC=-42.85
## log(y) ~ X1 + X2 + X4 + X5 + X8 + X9
## 
##        Df Sum of Sq     RSS     AIC
## <none>               4.5098 -42.849
## - X2    1    0.6110  5.1208 -41.037
## - X5    1    0.8871  5.3969 -39.461
## - X9    1    3.2799  7.7896 -28.452
## - X8    1    3.8347  8.3444 -26.388
## - X1    1    5.0057  9.5155 -22.448
## - X4    1   15.9600 20.4698   0.533
## Start:  AIC=-43.02
## log(y) ~ X1 + X2 + X3 + X5 + X6 + X7 + X8 + X9
## 
##        Df Sum of Sq     RSS     AIC
## - X5    1    0.0457  3.9703 -44.671
## - X2    1    0.1508  4.0754 -43.887
## <none>               3.9246 -43.018
## - X1    1    1.6800  5.6046 -34.328
## - X6    1    2.1914  6.1160 -31.708
## - X9    1    2.5158  6.4404 -30.158
## - X8    1    3.1937  7.1183 -27.156
## - X7    1    4.3217  8.2463 -22.743
## - X3    1   14.0627 17.9873   0.654
## 
## Step:  AIC=-44.67
## log(y) ~ X1 + X2 + X3 + X6 + X7 + X8 + X9
## 
##        Df Sum of Sq     RSS     AIC
## - X2    1    0.1126  4.0829 -45.832
## <none>               3.9703 -44.671
## - X6    1    2.5195  6.4898 -31.929
## - X9    1    2.7581  6.7284 -30.846
## - X1    1    2.7838  6.7541 -30.731
## - X8    1    3.6308  7.6011 -27.187
## - X7    1    4.2769  8.2472 -24.740
## - X3    1   24.3256 28.2959  12.246
## 
## Step:  AIC=-45.83
## log(y) ~ X1 + X3 + X6 + X7 + X8 + X9
## 
##        Df Sum of Sq     RSS     AIC
## <none>               4.0829 -45.832
## - X6    1    2.4147  6.4976 -33.893
## - X9    1    2.6501  6.7330 -32.825
## - X1    1    2.6955  6.7784 -32.624
## - X8    1    3.5347  7.6176 -29.122
## - X7    1    5.2580  9.3409 -23.004
## - X3    1   25.3225 29.4054  11.399
## Start:  AIC=-44.22
## log(y) ~ X1 + X2 + X3 + X4 + X6 + X7 + X8 + X9
## 
##        Df Sum of Sq    RSS     AIC
## - X1    1    0.0139 3.7844 -46.110
## - X2    1    0.0279 3.7983 -45.999
## - X4    1    0.1998 3.9703 -44.671
## <none>              3.7705 -44.220
## - X6    1    0.7524 4.5229 -40.762
## - X7    1    1.1605 4.9309 -38.170
## - X3    1    1.6186 5.3891 -35.505
## - X9    1    2.8181 6.5886 -29.476
## - X8    1    3.5442 7.3147 -26.339
## 
## Step:  AIC=-46.11
## log(y) ~ X2 + X3 + X4 + X6 + X7 + X8 + X9
## 
##        Df Sum of Sq    RSS     AIC
## - X2    1    0.0170 3.8013 -47.975
## <none>              3.7844 -46.110
## - X6    1    1.0707 4.8551 -40.635
## - X7    1    1.5504 5.3348 -37.808
## - X9    1    2.8243 6.6087 -31.384
## - X4    1    2.9697 6.7541 -30.731
## - X8    1    3.5305 7.3149 -28.339
## - X3    1    5.3638 9.1482 -21.629
## 
## Step:  AIC=-47.98
## log(y) ~ X3 + X4 + X6 + X7 + X8 + X9
## 
##        Df Sum of Sq    RSS     AIC
## <none>              3.8013 -47.975
## - X6    1    1.0542 4.8555 -42.632
## - X7    1    1.8739 5.6752 -37.953
## - X9    1    2.8256 6.6270 -33.302
## - X4    1    2.9771 6.7784 -32.624
## - X8    1    3.5182 7.3195 -30.320
## - X3    1    5.3653 9.1666 -23.569
## Start:  AIC=-39.8
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X7 + X8 + X9
## 
##        Df Sum of Sq    RSS     AIC
## - X3    1    0.0564 4.4250 -41.418
## - X2    1    0.1043 4.4729 -41.095
## - X7    1    0.1349 4.5035 -40.891
## - X5    1    0.1543 4.5229 -40.762
## <none>              4.3686 -39.803
## - X1    1    0.4338 4.8023 -38.963
## - X4    1    1.7475 6.1160 -31.708
## - X9    1    2.6668 7.0353 -27.508
## - X8    1    3.1935 7.5620 -25.342
## 
## Step:  AIC=-41.42
## log(y) ~ X1 + X2 + X4 + X5 + X7 + X8 + X9
## 
##        Df Sum of Sq     RSS     AIC
## - X7    1    0.0848  4.5098 -42.849
## - X2    1    0.3043  4.7293 -41.423
## <none>               4.4250 -41.418
## - X5    1    0.9642  5.3892 -37.504
## - X9    1    3.3632  7.7882 -26.458
## - X8    1    3.9187  8.3437 -24.391
## - X1    1    4.6412  9.0662 -21.899
## - X4    1   16.0173 20.4423   2.492
## 
## Step:  AIC=-42.85
## log(y) ~ X1 + X2 + X4 + X5 + X8 + X9
## 
##        Df Sum of Sq     RSS     AIC
## <none>               4.5098 -42.849
## - X2    1    0.6110  5.1208 -41.037
## - X5    1    0.8871  5.3969 -39.461
## - X9    1    3.2799  7.7896 -28.452
## - X8    1    3.8347  8.3444 -26.388
## - X1    1    5.0057  9.5155 -22.448
## - X4    1   15.9600 20.4698   0.533
## Start:  AIC=-39.78
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X6 + X8 + X9
## 
##        Df Sum of Sq    RSS     AIC
## - X3    1    0.0003 4.3729 -41.773
## - X6    1    0.1309 4.5035 -40.891
## <none>              4.3726 -39.775
## - X5    1    0.5584 4.9309 -38.170
## - X2    1    0.6554 5.0280 -37.585
## - X1    1    0.9371 5.3097 -35.950
## - X9    1    3.0659 7.4385 -25.836
## - X8    1    3.6837 8.0563 -23.442
## - X4    1    3.8737 8.2463 -22.743
## 
## Step:  AIC=-41.77
## log(y) ~ X1 + X2 + X4 + X5 + X6 + X8 + X9
## 
##        Df Sum of Sq     RSS     AIC
## - X6    1    0.1369  4.5098 -42.849
## <none>               4.3729 -41.773
## - X2    1    0.6577  5.0306 -39.570
## - X5    1    1.0236  5.3965 -37.464
## - X9    1    3.4161  7.7890 -26.455
## - X8    1    3.9564  8.3293 -24.442
## - X1    1    4.7933  9.1662 -21.570
## - X4    1   13.8200 18.1929  -1.005
## 
## Step:  AIC=-42.85
## log(y) ~ X1 + X2 + X4 + X5 + X8 + X9
## 
##        Df Sum of Sq     RSS     AIC
## <none>               4.5098 -42.849
## - X2    1    0.6110  5.1208 -41.037
## - X5    1    0.8871  5.3969 -39.461
## - X9    1    3.2799  7.7896 -28.452
## - X8    1    3.8347  8.3444 -26.388
## - X1    1    5.0057  9.5155 -22.448
## - X4    1   15.9600 20.4698   0.533
## Start:  AIC=-25.2
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X9
## 
##        Df Sum of Sq    RSS     AIC
## - X9    1   0.00002 7.1076 -27.201
## - X4    1   0.01076 7.1183 -27.156
## - X2    1   0.05385 7.1614 -26.975
## - X1    1   0.13496 7.2425 -26.637
## - X5    1   0.20709 7.3147 -26.340
## - X6    1   0.45446 7.5620 -25.342
## <none>              7.1076 -25.201
## - X7    1   0.94872 8.0563 -23.442
## - X3    1   1.22165 8.3292 -22.443
## 
## Step:  AIC=-27.2
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X6 + X7
## 
##        Df Sum of Sq    RSS     AIC
## - X4    1   0.01127 7.1189 -29.154
## - X2    1   0.05390 7.1615 -28.974
## - X1    1   0.13699 7.2446 -28.628
## - X5    1   0.20769 7.3153 -28.337
## - X6    1   0.45911 7.5667 -27.323
## <none>              7.1076 -27.201
## - X7    1   0.95446 8.0621 -25.421
## - X3    1   1.25421 8.3618 -24.326
## 
## Step:  AIC=-29.15
## log(y) ~ X1 + X2 + X3 + X5 + X6 + X7
## 
##        Df Sum of Sq     RSS      AIC
## - X2    1    0.1409  7.2598 -30.5654
## - X5    1    0.4878  7.6066 -29.1654
## <none>               7.1189 -29.1535
## - X6    1    1.1498  8.2686 -26.6619
## - X1    1    2.6634  9.7823 -21.6187
## - X7    1    3.8818 11.0007 -18.0972
## - X3    1   17.0862 24.2051   5.5609
## 
## Step:  AIC=-30.57
## log(y) ~ X1 + X3 + X5 + X6 + X7
## 
##        Df Sum of Sq     RSS      AIC
## - X5    1    0.3665  7.6263 -31.0878
## <none>               7.2598 -30.5654
## - X6    1    1.1128  8.3726 -28.2870
## - X1    1    2.6550  9.9148 -23.2150
## - X7    1    4.5672 11.8270 -17.9245
## - X3    1   19.3750 26.6348   6.4306
## 
## Step:  AIC=-31.09
## log(y) ~ X1 + X3 + X6 + X7
## 
##        Df Sum of Sq    RSS     AIC
## <none>               7.626 -31.088
## - X6    1    1.5290  9.155 -27.606
## - X1    1    2.7319 10.358 -23.902
## - X7    1    4.8252 12.452 -18.381
## - X3    1   26.0905 33.717  11.504
## Start:  AIC=-28.16
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8
## 
##        Df Sum of Sq    RSS     AIC
## - X4    1   0.00125 6.4404 -30.158
## - X2    1   0.03429 6.4735 -30.005
## - X5    1   0.14939 6.5886 -29.476
## - X1    1   0.15258 6.5918 -29.461
## <none>              6.4392 -28.164
## - X6    1   0.59615 7.0353 -27.508
## - X8    1   0.66842 7.1076 -27.201
## - X7    1   0.99933 7.4385 -25.836
## - X3    1   1.34802 7.7872 -24.462
## 
## Step:  AIC=-30.16
## log(y) ~ X1 + X2 + X3 + X5 + X6 + X7 + X8
## 
##        Df Sum of Sq     RSS     AIC
## - X2    1    0.0670  6.5074 -31.848
## - X5    1    0.2880  6.7284 -30.846
## <none>               6.4404 -30.158
## - X8    1    0.6784  7.1189 -29.153
## - X6    1    1.2983  7.7387 -26.649
## - X1    1    2.0749  8.5153 -23.780
## - X7    1    3.5965 10.0369 -18.848
## - X3    1   16.2255 22.6659   5.590
## 
## Step:  AIC=-31.85
## log(y) ~ X1 + X3 + X5 + X6 + X7 + X8
## 
##        Df Sum of Sq     RSS     AIC
## - X5    1    0.2255  6.7330 -32.825
## <none>               6.5074 -31.848
## - X8    1    0.7523  7.2598 -30.565
## - X6    1    1.2782  7.7856 -28.468
## - X1    1    2.1911  8.6986 -25.141
## - X7    1    4.5538 11.0612 -17.933
## - X3    1   18.9765 25.4839   7.105
## 
## Step:  AIC=-32.83
## log(y) ~ X1 + X3 + X6 + X7 + X8
## 
##        Df Sum of Sq    RSS     AIC
## <none>               6.733 -32.825
## - X8    1    0.8934  7.626 -31.088
## - X6    1    1.6647  8.398 -28.197
## - X1    1    2.4940  9.227 -25.372
## - X7    1    4.7585 11.491 -18.788
## - X3    1   26.5284 33.261  13.096
# Compare vif
ols_vif_tol(model_wf_full_log)
## # A tibble: 9 x 3
##   Variables Tolerance    VIF
##   <chr>         <dbl>  <dbl>
## 1 X1          0.00982 102.  
## 2 X2          0.133     7.52
## 3 X3          0.0318   31.4 
## 4 X4          0.00946 106.  
## 5 X5          0.103     9.68
## 6 X6          0.433     2.31
## 7 X7          0.0487   20.5 
## 8 X8          0.182     5.50
## 9 X9          0.174     5.75
ols_vif_tol(model_wf_aic_log)
## # A tibble: 6 x 3
##   Variables Tolerance   VIF
##   <chr>         <dbl> <dbl>
## 1 X3            0.332  3.01
## 2 X4            0.167  5.97
## 3 X6            0.839  1.19
## 4 X7            0.159  6.28
## 5 X8            0.202  4.94
## 6 X9            0.195  5.12
ols_vif_tol(model_wf_rm1_log)
## # A tibble: 8 x 3
##   Variables Tolerance   VIF
##   <chr>         <dbl> <dbl>
## 1 X2            0.245  4.08
## 2 X3            0.318  3.14
## 3 X4            0.115  8.70
## 4 X5            0.283  3.54
## 5 X6            0.717  1.39
## 6 X7            0.118  8.46
## 7 X8            0.190  5.27
## 8 X9            0.185  5.41
ols_vif_tol(model_wf_rm1_aic_log)
## # A tibble: 6 x 3
##   Variables Tolerance   VIF
##   <chr>         <dbl> <dbl>
## 1 X3            0.332  3.01
## 2 X4            0.167  5.97
## 3 X6            0.839  1.19
## 4 X7            0.159  6.28
## 5 X8            0.202  4.94
## 6 X9            0.195  5.12
ols_vif_tol(model_wf_rm2_log)
## # A tibble: 8 x 3
##   Variables Tolerance   VIF
##   <chr>         <dbl> <dbl>
## 1 X1           0.0181 55.2 
## 2 X3           0.0583 17.2 
## 3 X4           0.0148 67.4 
## 4 X5           0.163   6.13
## 5 X6           0.543   1.84
## 6 X7           0.114   8.79
## 7 X8           0.183   5.46
## 8 X9           0.175   5.72
ols_vif_tol(model_wf_rm2_aic_log)
## # A tibble: 6 x 3
##   Variables Tolerance   VIF
##   <chr>         <dbl> <dbl>
## 1 X3            0.332  3.01
## 2 X4            0.167  5.97
## 3 X6            0.839  1.19
## 4 X7            0.159  6.28
## 5 X8            0.202  4.94
## 6 X9            0.195  5.12
ols_vif_tol(model_wf_rm3_log)
## # A tibble: 8 x 3
##   Variables Tolerance   VIF
##   <chr>         <dbl> <dbl>
## 1 X1           0.0983 10.2 
## 2 X2           0.243   4.11
## 3 X4           0.113   8.87
## 4 X5           0.272   3.68
## 5 X6           0.767   1.30
## 6 X7           0.206   4.85
## 7 X8           0.190   5.26
## 8 X9           0.187   5.36
ols_vif_tol(model_wf_rm3_aic_log)
## # A tibble: 6 x 3
##   Variables Tolerance   VIF
##   <chr>         <dbl> <dbl>
## 1 X1            0.119  8.39
## 2 X2            0.313  3.19
## 3 X4            0.123  8.12
## 4 X5            0.343  2.92
## 5 X8            0.209  4.79
## 6 X9            0.205  4.88
ols_vif_tol(model_wf_rm4_log)
## # A tibble: 8 x 3
##   Variables Tolerance   VIF
##   <chr>         <dbl> <dbl>
## 1 X1            0.119  8.38
## 2 X2            0.209  4.79
## 3 X3            0.379  2.64
## 4 X5            0.187  5.35
## 5 X6            0.836  1.20
## 6 X7            0.165  6.05
## 7 X8            0.187  5.35
## 8 X9            0.183  5.45
ols_vif_tol(model_wf_rm4_aic_log)
## # A tibble: 6 x 3
##   Variables Tolerance   VIF
##   <chr>         <dbl> <dbl>
## 1 X1            0.279  3.58
## 2 X3            0.768  1.30
## 3 X6            0.917  1.09
## 4 X7            0.251  3.99
## 5 X8            0.202  4.94
## 6 X9            0.196  5.10
ols_vif_tol(model_wf_rm5_log)
## # A tibble: 8 x 3
##   Variables Tolerance   VIF
##   <chr>         <dbl> <dbl>
## 1 X1           0.0269 37.2 
## 2 X2           0.210   4.77
## 3 X3           0.0836 12.0 
## 4 X4           0.0171 58.4 
## 5 X6           0.485   2.06
## 6 X7           0.0774 12.9 
## 7 X8           0.200   5.01
## 8 X9           0.190   5.25
ols_vif_tol(model_wf_rm5_aic_log)
## # A tibble: 6 x 3
##   Variables Tolerance   VIF
##   <chr>         <dbl> <dbl>
## 1 X3            0.332  3.01
## 2 X4            0.167  5.97
## 3 X6            0.839  1.19
## 4 X7            0.159  6.28
## 5 X8            0.202  4.94
## 6 X9            0.195  5.12
ols_vif_tol(model_wf_rm6_log)
## # A tibble: 8 x 3
##   Variables Tolerance   VIF
##   <chr>         <dbl> <dbl>
## 1 X1           0.0162 61.5 
## 2 X2           0.167   6.00
## 3 X3           0.0563 17.8 
## 4 X4           0.0182 54.8 
## 5 X5           0.116   8.64
## 6 X7           0.0832 12.0 
## 7 X8           0.182   5.49
## 8 X9           0.174   5.75
ols_vif_tol(model_wf_rm6_aic_log)
## # A tibble: 6 x 3
##   Variables Tolerance   VIF
##   <chr>         <dbl> <dbl>
## 1 X1            0.119  8.39
## 2 X2            0.313  3.19
## 3 X4            0.123  8.12
## 4 X5            0.343  2.92
## 5 X8            0.209  4.79
## 6 X9            0.205  4.88
ols_vif_tol(model_wf_rm7_log)
## # A tibble: 8 x 3
##   Variables Tolerance   VIF
##   <chr>         <dbl> <dbl>
## 1 X1           0.0238 42.0 
## 2 X2           0.310   3.22
## 3 X3           0.135   7.42
## 4 X4           0.0321 31.1 
## 5 X5           0.164   6.09
## 6 X6           0.740   1.35
## 7 X8           0.184   5.45
## 8 X9           0.177   5.66
ols_vif_tol(model_wf_rm7_aic_log)
## # A tibble: 6 x 3
##   Variables Tolerance   VIF
##   <chr>         <dbl> <dbl>
## 1 X1            0.119  8.39
## 2 X2            0.313  3.19
## 3 X4            0.123  8.12
## 4 X5            0.343  2.92
## 5 X8            0.209  4.79
## 6 X9            0.205  4.88
ols_vif_tol(model_wf_rm8_log)
## # A tibble: 8 x 3
##   Variables Tolerance    VIF
##   <chr>         <dbl>  <dbl>
## 1 X1          0.0103   97.5 
## 2 X2          0.134     7.46
## 3 X3          0.0333   30.0 
## 4 X4          0.00973 103.  
## 5 X5          0.114     8.81
## 6 X6          0.435     2.30
## 7 X7          0.0492   20.3 
## 8 X9          0.879     1.14
ols_vif_tol(model_wf_rm8_aic_log)
## # A tibble: 4 x 3
##   Variables Tolerance   VIF
##   <chr>         <dbl> <dbl>
## 1 X1            0.281  3.56
## 2 X3            0.773  1.29
## 3 X6            0.949  1.05
## 4 X7            0.252  3.96
ols_vif_tol(model_wf_rm9_log)
## # A tibble: 8 x 3
##   Variables Tolerance    VIF
##   <chr>         <dbl>  <dbl>
## 1 X1          0.0104   95.8 
## 2 X2          0.134     7.48
## 3 X3          0.0341   29.3 
## 4 X4          0.00998 100.  
## 5 X5          0.113     8.83
## 6 X6          0.433     2.31
## 7 X7          0.0495   20.2 
## 8 X8          0.919     1.09
ols_vif_tol(model_wf_rm9_aic_log)
## # A tibble: 5 x 3
##   Variables Tolerance   VIF
##   <chr>         <dbl> <dbl>
## 1 X1            0.280  3.58
## 2 X3            0.771  1.30
## 3 X6            0.945  1.06
## 4 X7            0.252  3.97
## 5 X8            0.963  1.04
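
The Tolerance and VIF columns above are reciprocals of each other (for example 1/0.00982 is about 102 for X1 in the full model). A minimal sketch of where the numbers come from, assuming the built-in `mtcars` data as a stand-in for `table_wf` and a hypothetical helper `vif_manual`: each predictor is regressed on the remaining predictors and VIF = 1/(1 - R^2).

```r
# Stand-in data: mtcars ships with R; the report's table_wf is not available here.
fit <- lm(log(mpg) ~ wt + hp + disp + qsec, data = mtcars)

# Hypothetical helper: VIF of each predictor from its auxiliary regression.
vif_manual <- function(model) {
  X <- model.matrix(model)[, -1, drop = FALSE]  # drop the intercept column
  sapply(colnames(X), function(v) {
    others <- X[, colnames(X) != v, drop = FALSE]
    r2 <- summary(lm(X[, v] ~ others))$r.squared  # R^2 of v on the other predictors
    1 / (1 - r2)
  })
}

vif <- vif_manual(fit)
tol <- 1 / vif  # tolerance is the reciprocal of VIF
round(cbind(Tolerance = tol, VIF = vif), 3)
```

A predictor with tolerance near 0 is almost a linear combination of the others, which is why dropping X1 or X4 changes the remaining VIFs so sharply.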

(d) (3) Variable selection

library(huxtable)
huxreg(model_wf_rm1_log, model_wf_rm2_log, model_wf_rm3_log, model_wf_rm4_log, model_wf_rm5_log, model_wf_rm6_log, model_wf_rm7_log, model_wf_rm8_log, model_wf_rm9_log, model_wf_full_log)
            (1)         (2)         (3)         (4)         (5)         (6)         (7)         (8)         (9)         (10)
(Intercept) 3.280       3.703       7.731 ***   1.200       2.581 ***   5.523       7.487 **    -0.182      -0.127      3.402
            (1.690)     (2.333)     (1.678)     (2.110)     (0.547)     (3.075)     (2.314)     (4.072)     (3.844)     (3.150)
X2          -1.243                  6.526       -5.001      -2.144      4.655       8.550       -3.730      -2.980      -1.024
            (5.025)                 (5.358)     (5.568)     (5.445)     (6.574)     (4.819)     (9.350)     (8.912)     (6.995)
X3          0.183 ***   0.167 *                 0.278 ***   0.201 **    0.046       0.002       0.277       0.287 *     0.178
            (0.034)     (0.080)                 (0.032)     (0.067)     (0.088)     (0.057)     (0.146)     (0.137)     (0.111)
X4          0.104 **    0.119       0.286 ***               0.088       0.253 **    0.284 ***   0.027       0.009       0.109
            (0.032)     (0.090)     (0.035)                 (0.084)     (0.087)     (0.066)     (0.153)     (0.143)     (0.115)
X5          -0.008      -0.013      -0.055 *    0.013                   -0.031      -0.050      0.036       0.031       -0.010
            (0.021)     (0.028)     (0.023)     (0.027)                 (0.036)     (0.030)     (0.047)     (0.044)     (0.036)
X6          -0.396 *    -0.375      -0.161      -0.531 **   -0.408                  -0.138      -0.335      -0.385      -0.389
            (0.164)     (0.188)     (0.168)     (0.155)     (0.199)                 (0.174)     (0.289)     (0.276)     (0.216)
X7          4.317 **    3.975 *     0.959       6.088 ***   4.611 *     1.517                   5.230       5.352       4.233
            (1.466)     (1.495)     (1.178)     (1.266)     (1.814)     (1.884)                 (3.124)     (2.965)     (2.339)
X8          0.629 ***   0.632 ***   0.680 ***   0.606 ***   0.618 ***   0.614 ***   0.657 ***               0.125       0.630 ***
            (0.142)     (0.145)     (0.151)     (0.147)     (0.139)     (0.157)     (0.156)                 (0.085)     (0.149)
X9          -0.461 ***  -0.464 ***  -0.513 ***  -0.436 **   -0.453 ***  -0.461 **   -0.490 ***  0.001                   -0.462 **
            (0.116)     (0.119)     (0.122)     (0.119)     (0.114)     (0.129)     (0.128)     (0.073)                 (0.122)
X1                      -0.042      -0.457 ***  0.250 **    0.048       -0.345      -0.418 *    0.242       0.255       -0.014
                        (0.210)     (0.096)     (0.083)     (0.173)     (0.239)     (0.197)     (0.383)     (0.362)     (0.292)
N           30          30          30          30          30          30          30          30          30          30
R2          0.947       0.947       0.941       0.945       0.947       0.939       0.939       0.900       0.910       0.947
logLik      -11.407     -11.422     -13.216     -12.059     -11.458     -13.667     -13.681     -20.968     -19.486     -11.406
AIC         42.815      42.843      46.432      44.118      42.916      47.333      47.361      61.935      58.972      44.811
*** p < 0.001; ** p < 0.01; * p < 0.05.
huxreg(model_wf_rm1_aic_log, model_wf_rm2_aic_log, model_wf_rm3_aic_log, model_wf_rm4_aic_log, model_wf_rm5_aic_log, model_wf_rm6_aic_log, model_wf_rm7_aic_log, model_wf_rm8_aic_log, model_wf_rm9_aic_log, model_wf_aic_log)
            (1)         (2)         (3)         (4)         (5)         (6)         (7)         (8)         (9)         (10)
(Intercept) 2.692 ***   2.692 ***   6.882 ***   2.307 ***   2.692 ***   6.882 ***   6.882 ***   2.587 ***   2.225 ***   2.692 ***
            (0.445)     (0.445)     (1.432)     (0.410)     (0.445)     (1.432)     (1.432)     (0.494)     (0.515)     (0.445)
X3          0.184 ***   0.184 ***               0.263 ***   0.184 ***                           0.266 ***   0.268 ***   0.184 ***
            (0.032)     (0.032)                 (0.022)     (0.032)                             (0.029)     (0.028)     (0.032)
X4          0.109 ***   0.109 ***   0.294 ***               0.109 ***   0.294 ***   0.294 ***                           0.109 ***
            (0.026)     (0.026)     (0.033)                 (0.026)     (0.033)     (0.033)                             (0.026)
X6          -0.368 *    -0.368 *                -0.532 **   -0.368 *                            -0.416 *    -0.435 *    -0.368 *
            (0.146)     (0.146)                 (0.144)     (0.146)                             (0.186)     (0.179)     (0.146)
X7          4.085 **    4.085 **                5.453 ***   4.085 **                            5.209 ***   5.174 ***   4.085 **
            (1.213)     (1.213)                 (1.002)     (1.213)                             (1.310)     (1.256)     (1.213)
X8          0.612 ***   0.612 ***   0.629 ***   0.613 ***   0.612 ***   0.629 ***   0.629 ***               0.141       0.612 ***
            (0.133)     (0.133)     (0.142)     (0.137)     (0.133)     (0.142)     (0.142)                 (0.079)     (0.133)
X9          -0.448 ***  -0.448 ***  -0.471 ***  -0.433 ***  -0.448 ***  -0.471 ***  -0.471 ***                          -0.448 ***
            (0.108)     (0.108)     (0.115)     (0.112)     (0.108)     (0.115)     (0.115)                             (0.108)
X1                                  -0.432 ***  0.207 ***               -0.432 ***  -0.432 ***  0.208 **    0.199 **
                                    (0.086)     (0.053)                 (0.086)     (0.086)     (0.070)     (0.067)
X2                                  8.217                               8.217       8.217
                                    (4.655)                             (4.655)     (4.655)
X5                                  -0.043 *                            -0.043 *    -0.043 *
                                    (0.020)                             (0.020)     (0.020)
N           30          30          30          30          30          30          30          30          30          30
R2          0.947       0.947       0.937       0.943       0.947       0.937       0.937       0.893       0.906       0.947
logLik      -11.581     -11.581     -14.144     -12.652     -11.581     -14.144     -14.144     -22.024     -20.155     -11.581
AIC         39.161      39.161      44.288      41.305      39.161      44.288      44.288      56.049      54.311      39.161
*** p < 0.001; ** p < 0.01; * p < 0.05.
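
The AIC values in these tables are positive (e.g. 42.916) while the stepwise traces for the same models are negative (e.g. -44.22). The two appear to be the same criterion on different scales: the traces match `extractAIC()`, which drops additive constants, and the table values match `AIC()`. A minimal sketch of the constant gap for a Gaussian linear model, assuming `mtcars` as stand-in data:

```r
fit <- lm(log(mpg) ~ wt + hp + disp, data = mtcars)  # stand-in data
n <- nobs(fit)

aic_step <- extractAIC(fit)[2]         # n * log(RSS / n) + 2 * p, as in step() traces
aic_full <- AIC(fit)                   # -2 * logLik + 2 * (p + 1), counts sigma as a parameter
offset   <- n * (log(2 * pi) + 1) + 2  # constant difference between the two scales

c(extractAIC = aic_step, AIC = aic_full, gap = aic_full - aic_step, offset = offset)

# For the report's n = 30 the gap is about 87.14, so -44.22 + 87.14 is about 42.92.
30 * (log(2 * pi) + 1) + 2
```

Because the gap depends only on n, both scales rank models fitted to the same 30 observations identically.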

(d) (4) Forward selection

Stepwise Forward Regression for full model

# Stepwise Forward Regression based on p values (use a=0.15) #
ols_step_forward_p(model_wf_full_log, penter = 0.15)

# Stepwise AIC Forward Regression #
ols_step_forward_aic(model_wf_full_log)

Stepwise Forward Regression for X4 eliminated model

# Stepwise Forward Regression based on p values (use a=0.15) #
ols_step_forward_p(model_wf_rm4_log, penter = 0.15)
# Stepwise AIC Forward Regression #
ols_step_forward_aic(model_wf_rm4_log)

Stepwise Forward Regression for X1 eliminated model

# Stepwise Forward Regression based on p values (use a=0.15) #
ols_step_forward_p(model_wf_rm1_log, penter = 0.15)
# Stepwise AIC Forward Regression #
ols_step_forward_aic(model_wf_rm1_log)

(d) (5) Backward selection

Stepwise Backward Regression for full model

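# Stepwise Backward Regression based on p values (use a=0.05) #
# Backward elimination takes the removal threshold `prem`, not `penter`. The output below came from
# the original `penter = 0.05` call, where the value was ignored and the default prem = 0.3 applied.
# Every retained term has p < 0.05, so the final model is the same under either threshold.
ols_step_backward_p(model_wf_full_log, prem = 0.05)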
## Backward Elimination Method 
## ---------------------------
## 
## Candidate Terms: 
## 
## 1 . X1 
## 2 . X2 
## 3 . X3 
## 4 . X4 
## 5 . X5 
## 6 . X6 
## 7 . X7 
## 8 . X8 
## 9 . X9 
## 
## We are eliminating variables based on p value...
## 
## Variables Removed: 
## 
## - X1 
## - X2 
## - X5 
## 
## No more variables satisfy the condition of p value = 0.3
## 
## 
## Final Model Output 
## ------------------
## 
##                         Model Summary                         
## -------------------------------------------------------------
## R                       0.973       RMSE               0.407 
## R-Squared               0.947       Coef. Var          6.385 
## Adj. R-Squared          0.933       MSE                0.165 
## Pred R-Squared          0.908       MAE                0.273 
## -------------------------------------------------------------
##  RMSE: Root Mean Square Error 
##  MSE: Mean Square Error 
##  MAE: Mean Absolute Error 
## 
##                                ANOVA                                
## -------------------------------------------------------------------
##                Sum of                                              
##               Squares        DF    Mean Square      F         Sig. 
## -------------------------------------------------------------------
## Regression     67.591         6         11.265     68.16    0.0000 
## Residual        3.801        23          0.165                     
## Total          71.393        29                                    
## -------------------------------------------------------------------
## 
##                                   Parameter Estimates                                    
## ----------------------------------------------------------------------------------------
##       model      Beta    Std. Error    Std. Beta      t        Sig      lower     upper 
## ----------------------------------------------------------------------------------------
## (Intercept)     2.692         0.445                  6.046    0.000     1.771     3.613 
##          X3     0.184         0.032        0.476     5.698    0.000     0.117     0.251 
##          X4     0.109         0.026        0.499     4.244    0.000     0.056     0.162 
##          X6    -0.368         0.146       -0.133    -2.526    0.019    -0.669    -0.066 
##          X7     4.085         1.213        0.406     3.367    0.003     1.575     6.595 
##          X8     0.612         0.133        0.493     4.614    0.000     0.337     0.886 
##          X9    -0.448         0.108       -0.450    -4.135    0.000    -0.672    -0.224 
## ----------------------------------------------------------------------------------------
## 
## 
##                           Elimination Summary                           
## -----------------------------------------------------------------------
##         Variable                  Adj.                                     
## Step    Removed     R-Square    R-Square     C(p)       AIC       RMSE     
## -----------------------------------------------------------------------
##    1    X1            0.9474      0.9273    8.0021    42.8146    0.4230    
##    2    X2            0.9472      0.9304    6.0604    40.9019    0.4139    
##    3    X5            0.9468      0.9329    4.2345    39.1611    0.4065    
## -----------------------------------------------------------------------
# Stepwise AIC Backward Regression #
ols_step_backward_aic(model_wf_full_log)
## Backward Elimination Method 
## ---------------------------
## 
## Candidate Terms: 
## 
## 1 . X1 
## 2 . X2 
## 3 . X3 
## 4 . X4 
## 5 . X5 
## 6 . X6 
## 7 . X7 
## 8 . X8 
## 9 . X9 
## 
## 
## Variables Removed: 
## 
## - X1 
## - X2 
## - X5 
## 
## No more variables to be removed.
## 
## 
##                   Backward Elimination Summary                   
## ---------------------------------------------------------------
## Variable       AIC       RSS     Sum Sq     R-Sq      Adj. R-Sq 
## ---------------------------------------------------------------
## Full Model    44.811    3.757    67.635    0.94737      0.92369 
## X1            42.815    3.758    67.635    0.94737      0.92731 
## X2            40.902    3.769    67.624    0.94721      0.93042 
## X5            39.161    3.801    67.591    0.94675      0.93286 
## ---------------------------------------------------------------
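
The elimination loop above can be sketched in base R. This is a minimal illustration, not olsrr's implementation: `backward_p` and `alpha_remove` are hypothetical names, and `mtcars` stands in for `table_wf`. At each pass the term with the largest partial-F p-value is dropped, until every remaining p-value is at or below the threshold.

```r
# Hypothetical helper: p-value based backward elimination using drop1().
backward_p <- function(model, alpha_remove = 0.05) {
  repeat {
    if (length(attr(terms(model), "term.labels")) == 0) break  # nothing left to drop
    tab <- drop1(model, test = "F")      # partial F test for each term
    p <- tab[["Pr(>F)"]][-1]             # skip the <none> row
    names(p) <- rownames(tab)[-1]
    if (max(p) <= alpha_remove) break    # all remaining terms pass the threshold
    worst <- names(which.max(p))         # least significant term
    model <- update(model, as.formula(paste(". ~ . -", worst)))
  }
  model
}

full  <- lm(log(mpg) ~ wt + hp + disp + qsec + drat, data = mtcars)  # stand-in data
final <- backward_p(full, alpha_remove = 0.05)
formula(final)
summary(final)$coefficients
```

For single-degree-of-freedom terms the partial F test is equivalent to the coefficient t test, so the p-values match those in a parameter-estimates table.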

Stepwise Backward Regression for X4 eliminated model

# Stepwise Backward Regression based on p values (use a=0.05) #
# `prem` is the removal threshold; `penter` is not used by backward elimination
ols_step_backward_p(model_wf_rm4_log, prem = 0.05)
# Stepwise AIC Backward Regression #
ols_step_backward_aic(model_wf_rm4_log)

Stepwise Backward Regression for X1 eliminated model

# Stepwise Backward Regression based on p values (use a=0.05) #
# `prem` is the removal threshold; `penter` is not used by backward elimination
ols_step_backward_p(model_wf_rm1_log, prem = 0.05)
# Stepwise AIC Backward Regression #
ols_step_backward_aic(model_wf_rm1_log)

(d) (6) Best Subset Regression

# For full model #
k <- ols_step_best_subset(model_wf_full_log)
k
mindex  n  predictors                  rsquare  adjr   predrsq  cp    aic   sbic   sbc   msep   fpe    apc     hsp
1       1  X4                          0.803    0.796  0.772    48.9  68.4  -19.8  72.6  0.538  0.536  0.225   0.0186
2       2  X3 X4                       0.873    0.864  0.844    24.2  57.2  -30.5  62.8  0.373  0.369  0.155   0.0129
3       3  X3 X4 X7                    0.89     0.878  0.854    19.7  54.8  -32.7  61.8  0.348  0.341  0.143   0.012
4       4  X1 X4 X8 X9                 0.921    0.908  0.886    10.1  47    -38.1  55.4  0.272  0.264  0.111   0.00941
5       5  X3 X4 X7 X8 X9              0.932    0.918  0.892    7.85  44.5  -38.8  54.3  0.255  0.243  0.102   0.0088
6       6  X3 X4 X6 X7 X8 X9           0.947    0.933  0.908    4.23  39.2  -39.7  50.4  0.217  0.204  0.0857  0.00751
7       7  X3 X4 X5 X6 X7 X8 X9        0.947    0.93   0.902    6.06  40.9  -36.8  53.5  0.236  0.217  0.0912  0.00816
8       8  X2 X3 X4 X5 X6 X7 X8 X9     0.947    0.927  0.896    8     42.8  -33.8  56.8  0.259  0.233  0.0977  0.00895
9       9  X1 X2 X3 X4 X5 X6 X7 X8 X9  0.947    0.924  0.886    10    44.8  -30.8  60.2  0.286  0.25   0.105   0.00989
plot(k)
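
The `cp` and `adjr` columns can be checked by hand from the sums of squares reported in this section. A short sketch using the six-predictor subset (X3 X4 X6 X7 X8 X9), with Mallows' Cp = SSE_p/MSE_full - (n - 2p) and p counting the intercept; the inputs are the rounded values printed above, so the results agree only to rounding.

```r
n        <- 30       # observations
sst      <- 71.393   # total sum of squares of log(y)
sse_full <- 3.757    # residual SS with all nine predictors
p_full   <- 10       # nine slopes + intercept
sse_sub  <- 3.801    # residual SS for X3 X4 X6 X7 X8 X9
p_sub    <- 7        # six slopes + intercept

mse_full <- sse_full / (n - p_full)                # error variance estimate from the full model
cp       <- sse_sub / mse_full - (n - 2 * p_sub)   # Mallows' Cp
adj_r2   <- 1 - (sse_sub / (n - p_sub)) / (sst / (n - 1))

round(c(Cp = cp, adj_R2 = adj_r2), 3)
```

A Cp below p (here about 4.23 against p = 7) together with the highest adjusted R-squared is what singles out the six-predictor subset in the table.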

# For X4 eliminated model #
# k <- ols_step_best_subset(model_wf_rm4_log)
# k
# plot(k)

# For X1 eliminated model #
# k <- ols_step_best_subset(model_wf_rm1_log)
# k
# plot(k)

(d) (7) Models Comparison

  • Model 437896
# build model 437896
model_wf_437896_log <- lm(log(y) ~ X4 + X3 + X7 + X8 + X9 + X6, data=table_wf)
ols_regress(model_wf_437896_log)
##                         Model Summary                         
## -------------------------------------------------------------
## R                       0.973       RMSE               0.407 
## R-Squared               0.947       Coef. Var          6.385 
## Adj. R-Squared          0.933       MSE                0.165 
## Pred R-Squared          0.908       MAE                0.273 
## -------------------------------------------------------------
##  RMSE: Root Mean Square Error 
##  MSE: Mean Square Error 
##  MAE: Mean Absolute Error 
## 
##                                ANOVA                                
## -------------------------------------------------------------------
##                Sum of                                              
##               Squares        DF    Mean Square      F         Sig. 
## -------------------------------------------------------------------
## Regression     67.591         6         11.265     68.16    0.0000 
## Residual        3.801        23          0.165                     
## Total          71.393        29                                    
## -------------------------------------------------------------------
## 
##                                   Parameter Estimates                                    
## ----------------------------------------------------------------------------------------
##       model      Beta    Std. Error    Std. Beta      t        Sig      lower     upper 
## ----------------------------------------------------------------------------------------
## (Intercept)     2.692         0.445                  6.046    0.000     1.771     3.613 
##          X4     0.109         0.026        0.499     4.244    0.000     0.056     0.162 
##          X3     0.184         0.032        0.476     5.698    0.000     0.117     0.251 
##          X7     4.085         1.213        0.406     3.367    0.003     1.575     6.595 
##          X8     0.612         0.133        0.493     4.614    0.000     0.337     0.886 
##          X9    -0.448         0.108       -0.450    -4.135    0.000    -0.672    -0.224 
##          X6    -0.368         0.146       -0.133    -2.526    0.019    -0.669    -0.066 
## ----------------------------------------------------------------------------------------
confint(model_wf_437896_log, level = 1-(0.05/7)) # Bonferroni joint confidence interval #
##                 0.357 %    99.643 %
## (Intercept)  1.37732232  4.00627202
## X4           0.03318876  0.18491189
## X3           0.08857700  0.27911109
## X7           0.50312634  7.66681371
## X8           0.22022202  1.00298907
## X9          -0.76727751 -0.12799849
## X6          -0.79716475  0.06213151
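
The `level = 1-(0.05/7)` argument splits a family error rate of 0.05 across g = 7 coefficients, which is why the column headers read 0.357 % and 99.643 %. A minimal sketch of the same construction done by hand, assuming `mtcars` as stand-in data (so g = 4 here): each interval is estimate ± t(1 - α/(2g), df) × standard error.

```r
fit   <- lm(log(mpg) ~ wt + hp + disp, data = mtcars)  # stand-in data
alpha <- 0.05
g     <- length(coef(fit))   # number of intervals in the family

# Built-in route: widen each interval to level 1 - alpha / g.
ci_confint <- confint(fit, level = 1 - alpha / g)

# Manual route: estimate +/- t critical value * standard error.
est   <- coef(fit)
se    <- sqrt(diag(vcov(fit)))
tcrit <- qt(1 - alpha / (2 * g), df = df.residual(fit))
ci_manual <- cbind(lower = est - tcrit * se, upper = est + tcrit * se)

ci_confint
ci_manual
```

In the output above the joint interval for X6 contains zero even though its individual 95% interval does not, which is the price of the wider Bonferroni limits.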
# Collinearity Diagnostics #
ols_vif_tol(model_wf_437896_log)
Variables  Tolerance  VIF
X4         0.167      5.97
X3         0.332      3.01
X7         0.159      6.28
X8         0.202      4.94
X9         0.195      5.12
X6         0.839      1.19
#Model Fit Assessment
ols_plot_diagnostics(model_wf_437896_log)

# Correlation Test for Normality
ols_test_correlation(model_wf_437896_log) # Correlation between observed residuals and expected residuals under normality.
## [1] 0.9837263
# Residual Normality Test
ols_test_normality(model_wf_437896_log) # Tests for violation of the normality assumption; a p-value above 0.05 gives no evidence against normality #
## -----------------------------------------------
##        Test             Statistic       pvalue  
## -----------------------------------------------
## Shapiro-Wilk              0.9728         0.6175 
## Kolmogorov-Smirnov        0.0997         0.8982 
## Cramer-von Mises          4.8429         0.0000 
## Anderson-Darling          0.2996         0.5612 
## -----------------------------------------------
# Variable Contributions
ols_plot_added_variable(model_wf_437896_log)

# Residual Plus Component Plot
ols_plot_comp_plus_resid(model_wf_437896_log)

  • Model 437
# build model 437
model_wf_437_log <- lm(log(y) ~ X4 + X3 + X7, data=table_wf)
ols_regress(model_wf_437_log)
##                         Model Summary                         
## -------------------------------------------------------------
## R                       0.944       RMSE               0.549 
## R-Squared               0.890       Coef. Var          8.618 
## Adj. R-Squared          0.878       MSE                0.301 
## Pred R-Squared          0.854       MAE                0.414 
## -------------------------------------------------------------
##  RMSE: Root Mean Square Error 
##  MSE: Mean Square Error 
##  MAE: Mean Absolute Error 
## 
##                                ANOVA                                
## -------------------------------------------------------------------
##                Sum of                                              
##               Squares        DF    Mean Square      F         Sig. 
## -------------------------------------------------------------------
## Regression     63.565         3         21.188    70.378    0.0000 
## Residual        7.828        26          0.301                     
## Total          71.393        29                                    
## -------------------------------------------------------------------
## 
##                                  Parameter Estimates                                  
## -------------------------------------------------------------------------------------
##       model     Beta    Std. Error    Std. Beta      t       Sig      lower    upper 
## -------------------------------------------------------------------------------------
## (Intercept)    2.872         0.547                 5.254    0.000     1.748    3.995 
##          X4    0.122         0.033        0.559    3.730    0.001     0.055    0.189 
##          X3    0.168         0.040        0.435    4.165    0.000     0.085    0.251 
##          X7    3.106         1.537        0.309    2.021    0.054    -0.053    6.266 
## -------------------------------------------------------------------------------------
# Collinearity Diagnostics #
ols_vif_tol(model_wf_437_log)
Variables  Tolerance  VIF
X4         0.188      5.32
X3         0.386      2.59
X7         0.181      5.53
#Model Fit Assessment
ols_plot_diagnostics(model_wf_437_log)

# Correlation Test for Normality
ols_test_correlation(model_wf_437_log) # Correlation between observed residuals and expected residuals under normality.
## [1] 0.9856766
# Residual Normality Test
ols_test_normality(model_wf_437_log) # Tests for violation of the normality assumption; a p-value above 0.05 gives no evidence against normality #
## -----------------------------------------------
##        Test             Statistic       pvalue  
## -----------------------------------------------
## Shapiro-Wilk              0.9765         0.7267 
## Kolmogorov-Smirnov        0.1033         0.8736 
## Cramer-von Mises          3.1908         0.0000 
## Anderson-Darling          0.3511         0.4469 
## -----------------------------------------------
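Three of the four tests above give large p-values, while the Cramer-von Mises row disagrees sharply, so that row is worth checking against another implementation before reading much into it. As a cross-check, the Shapiro-Wilk test and a normal-probability correlation can be computed in base R. The sketch below is an approximation under stated assumptions: it fits a stand-in model on `mtcars`, and it uses `ppoints()` plotting positions, which may differ slightly from the positions olsrr uses.

```r
# Stand-in model (assumption): the report's table_wf data is not available here.
fit <- lm(mpg ~ wt + hp, data = mtcars)
res <- residuals(fit)

# Shapiro-Wilk test on the residuals.
sw <- shapiro.test(res)

# Correlation between ordered residuals and expected normal quantiles.
n <- length(res)
expected <- qnorm(ppoints(n))
r_norm <- cor(sort(res), expected)

print(sw$p.value)
print(r_norm)
```

A correlation close to 1 indicates that the ordered residuals track the normal quantiles closely.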
# Variable Contributions
ols_plot_added_variable(model_wf_437_log)

# Residual Plus Component Plot
ols_plot_comp_plus_resid(model_wf_437_log)

# Check PRESS Statistic
ols_press(model_wf_full)
## [1] 15880486
ols_press(model_wf_full_log)
## [1] 8.136733
ols_press(model_wf_437896_log)
## [1] 6.538275
ols_press(model_wf_437_log)
## [1] 10.43262
# ols_press(model_wf_137689_log)
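PRESS is the sum of squared leave-one-out prediction errors, which for a linear model can be computed without refitting as the sum of (e_i / (1 - h_ii))^2. Note that `model_wf_full` is presumably fitted to y rather than log(y), so its PRESS is on a different scale and is not directly comparable with the log models. A minimal sketch of the identity, using `mtcars` as a stand-in data set:

```r
# Stand-in log-response model (assumption): table_wf is not available here.
fit <- lm(log(mpg) ~ wt + hp, data = mtcars)

# Shortcut: residuals inflated by their leverage.
e <- residuals(fit)
h <- hatvalues(fit)
press <- sum((e / (1 - h))^2)

# Brute force: refit without observation i and predict it.
loo <- sapply(seq_len(nrow(mtcars)), function(i) {
  fit_i <- lm(formula(fit), data = mtcars[-i, ])
  log(mtcars$mpg[i]) - predict(fit_i, newdata = mtcars[i, ])
})

print(c(shortcut = press, brute_force = sum(loo^2)))
```

The two numbers agree, which is why PRESS is cheap to compute even for many candidate models.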

# prediction power
ols_pred_rsq(model_wf_full)
## [1] 0.6179649
ols_pred_rsq(model_wf_full_log)
## [1] 0.8860283
ols_pred_rsq(model_wf_437896_log)
## [1] 0.908418
ols_pred_rsq(model_wf_437_log)
## [1] 0.8538697
# ols_pred_rsq(model_wf_137689_log)
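The predicted R-squared follows from PRESS and the total sum of squares. For `model_wf_437_log`, taking the total sum of squares 71.393 from the ANOVA table above (assumed to belong to this model) and the PRESS value 10.43262:

\[R^2_{pred}=1-\frac{PRESS}{SS_{Total}}=1-\frac{10.43262}{71.393}\approx0.8539\]

This agrees with the reported 0.8538697 up to rounding of the total sum of squares.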

More

  • Other Models
# build X1*X8 eliminated log model
model_wf_18rm4_log <- lm(log(y) ~ X1*X8 + X3 + X6 + X7 + X9, data=table_wf)

# build log model with the product X1*X8 as a single term (no separate main effects)
table_wf_resi <- table_wf %>% mutate(x1t8 = X1 * X8)
model_wf_1time8_log <- lm(log(y) ~ x1t8 + X3 + X6 + X7 + X9, data = table_wf_resi)
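The two models above differ more than their names suggest. In an R formula, `X1*X8` expands to both main effects plus the interaction, whereas a precomputed product column enters as a single term. A small sketch that inspects the expanded terms without needing the data (the formulas are illustrative copies, shortened to one extra predictor):

```r
# Formula with the * operator: main effects plus interaction.
f_star <- log(y) ~ X1*X8 + X3
# Formula with a precomputed product column: one term only.
f_prod <- log(y) ~ x1t8 + X3

labels_star <- attr(terms(f_star), "term.labels")
labels_prod <- attr(terms(f_prod), "term.labels")

print(labels_star)
print(labels_prod)
```

Wrapping the product as `I(X1*X8)` inside the formula would give the same single-term behaviour without creating a new column.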

# build X1*X4 eliminated log model
table_wf_resi <- table_wf%>% mutate(x1t4=X1*X4)
model_wf_1time4_log <- lm(log(y) ~ x1t4 + X3 + X6 + X7+ X8+ X9, data=table_wf_resi)
summary(model_wf_1time4_log)

# build X1/X4 eliminated log model
table_wf_resi <- table_wf%>% mutate(x14=X1/X4)
model_wf_1per4_log <- lm(log(y) ~ x14 + X3 + X6 + X7+ X8+ X9, data=table_wf_resi)

# build X4*X3 eliminated log model
model_wf_43rm1_log <- lm(log(y) ~ X9 + X4*X3 + X6 + X7 + X8 , data=table_wf)

# build X4*X9 eliminated log model
model_wf_49rm1_log <- lm(log(y) ~ X3 + X4*X9 + X6 + X7 + X8 , data=table_wf)

# build X4*X8 eliminated log model
model_wf_48rm1_log <- lm(log(y) ~ X3 + X4*X8 + X6 + X7 + X9, data=table_wf)

# build X4*X7 eliminated log model
model_wf_47rm1_log <- lm(log(y) ~ X3 + X4*X7 + X6 + X9 + X8, data=table_wf)

# build X4/X9 eliminated log model
table_wf_resi <- table_wf%>% mutate(x4p9=X4/X9)
model_wf_4per9_log <- lm(log(y) ~ X3 + x4p9 + X6 + X7 + X8 , data=table_wf_resi)

# build X3*X4vX8*X9 eliminated log model
model_wf_34v89_log <- lm(log(y) ~ X3*X4 + X8*X9 + X6 + X7, data=table_wf_resi)

# build X3*X4vX8*X9vX6*X7 eliminated log model
model_wf_34v89v67_log <- lm(log(y) ~ X3*X4 + X8*X9 + X6*X7, data=table_wf_resi)

# build X8/X9vX4*X3 eliminated log model
table_wf_resi <- table_wf%>% mutate(x8p9=X8/X9)
model_wf_8per9v43_log <- lm(log(y) ~ x8p9 + X4*X3 + X6 + X7, data=table_wf_resi)

# build X6/X7vX8/X9vX4*X3 eliminated log model
table_wf_resi <- table_wf%>% mutate(x8p9=X8/X9,x6p7=X6/X7)
model_wf_6p7v8p9v43_log <- lm(log(y) ~ x8p9 + X4*X3 + x6p7, data=table_wf_resi)

# build X8/X9vX4*X3rmX7 eliminated log model
table_wf_resi <- table_wf%>% mutate(x8p9=X8/X9)
model_wf_8per9v43rm7_log <- lm(log(y) ~ x8p9 + X4*X3 + X6, data=table_wf_resi)

huxreg(model_wf_8per9v43rm7_log, model_wf_8per9v43_log, model_wf_43rm1_log, model_wf_6p7v8p9v43_log, model_wf_34v89_log, model_wf_34v89v67_log)
  • Interaction models
# Interaction regression for full
model_wf_full_log_inter <- lm(log(y)~ (X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9)^2, data=table_wf)
model_wf_aic_log_inter <- stepAIC(model_wf_full_log_inter)
## Start:  AIC=-86.68
## log(y) ~ (X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9)^2
## 
## 
## Step:  AIC=-86.68
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X1:X2 + 
##     X1:X3 + X1:X4 + X1:X5 + X1:X6 + X1:X7 + X1:X8 + X1:X9 + X2:X3 + 
##     X2:X4 + X2:X5 + X2:X6 + X2:X7 + X2:X8 + X2:X9 + X3:X4 + X3:X5 + 
##     X3:X6 + X3:X7 + X3:X8 + X3:X9 + X4:X5 + X4:X6 + X4:X7 + X4:X8 + 
##     X4:X9 + X5:X6 + X5:X7 + X5:X8 + X5:X9 + X6:X8 + X6:X9 + X7:X8 + 
##     X7:X9 + X8:X9
## 
## 
## Step:  AIC=-86.68
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X1:X2 + 
##     X1:X3 + X1:X4 + X1:X5 + X1:X6 + X1:X7 + X1:X8 + X1:X9 + X2:X3 + 
##     X2:X4 + X2:X5 + X2:X6 + X2:X7 + X2:X8 + X2:X9 + X3:X4 + X3:X5 + 
##     X3:X6 + X3:X7 + X3:X8 + X3:X9 + X4:X5 + X4:X6 + X4:X7 + X4:X8 + 
##     X4:X9 + X5:X6 + X5:X8 + X5:X9 + X6:X8 + X6:X9 + X7:X8 + X7:X9 + 
##     X8:X9
## 
## 
## Step:  AIC=-86.68
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X1:X2 + 
##     X1:X3 + X1:X4 + X1:X5 + X1:X6 + X1:X7 + X1:X8 + X1:X9 + X2:X3 + 
##     X2:X4 + X2:X5 + X2:X6 + X2:X7 + X2:X8 + X2:X9 + X3:X4 + X3:X5 + 
##     X3:X6 + X3:X7 + X3:X8 + X3:X9 + X4:X5 + X4:X6 + X4:X7 + X4:X8 + 
##     X4:X9 + X5:X8 + X5:X9 + X6:X8 + X6:X9 + X7:X8 + X7:X9 + X8:X9
## 
## 
## Step:  AIC=-86.68
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X1:X2 + 
##     X1:X3 + X1:X4 + X1:X5 + X1:X6 + X1:X7 + X1:X8 + X1:X9 + X2:X3 + 
##     X2:X4 + X2:X5 + X2:X6 + X2:X7 + X2:X8 + X2:X9 + X3:X4 + X3:X5 + 
##     X3:X6 + X3:X7 + X3:X8 + X3:X9 + X4:X5 + X4:X6 + X4:X8 + X4:X9 + 
##     X5:X8 + X5:X9 + X6:X8 + X6:X9 + X7:X8 + X7:X9 + X8:X9
## 
## 
## Step:  AIC=-86.68
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X1:X2 + 
##     X1:X3 + X1:X4 + X1:X5 + X1:X6 + X1:X7 + X1:X8 + X1:X9 + X2:X3 + 
##     X2:X4 + X2:X5 + X2:X6 + X2:X7 + X2:X8 + X2:X9 + X3:X4 + X3:X5 + 
##     X3:X6 + X3:X7 + X3:X8 + X3:X9 + X4:X5 + X4:X8 + X4:X9 + X5:X8 + 
##     X5:X9 + X6:X8 + X6:X9 + X7:X8 + X7:X9 + X8:X9
## 
## 
## Step:  AIC=-86.68
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X1:X2 + 
##     X1:X3 + X1:X4 + X1:X5 + X1:X6 + X1:X7 + X1:X8 + X1:X9 + X2:X3 + 
##     X2:X4 + X2:X5 + X2:X6 + X2:X7 + X2:X8 + X2:X9 + X3:X4 + X3:X5 + 
##     X3:X6 + X3:X7 + X3:X8 + X3:X9 + X4:X8 + X4:X9 + X5:X8 + X5:X9 + 
##     X6:X8 + X6:X9 + X7:X8 + X7:X9 + X8:X9
## 
## 
## Step:  AIC=-86.68
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X1:X2 + 
##     X1:X3 + X1:X4 + X1:X5 + X1:X6 + X1:X7 + X1:X8 + X1:X9 + X2:X3 + 
##     X2:X4 + X2:X5 + X2:X6 + X2:X7 + X2:X8 + X2:X9 + X3:X4 + X3:X5 + 
##     X3:X6 + X3:X8 + X3:X9 + X4:X8 + X4:X9 + X5:X8 + X5:X9 + X6:X8 + 
##     X6:X9 + X7:X8 + X7:X9 + X8:X9
## 
## 
## Step:  AIC=-86.68
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X1:X2 + 
##     X1:X3 + X1:X4 + X1:X5 + X1:X6 + X1:X7 + X1:X8 + X1:X9 + X2:X3 + 
##     X2:X4 + X2:X5 + X2:X6 + X2:X7 + X2:X8 + X2:X9 + X3:X4 + X3:X5 + 
##     X3:X8 + X3:X9 + X4:X8 + X4:X9 + X5:X8 + X5:X9 + X6:X8 + X6:X9 + 
##     X7:X8 + X7:X9 + X8:X9
## 
## 
## Step:  AIC=-86.68
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X1:X2 + 
##     X1:X3 + X1:X4 + X1:X5 + X1:X6 + X1:X7 + X1:X8 + X1:X9 + X2:X3 + 
##     X2:X4 + X2:X5 + X2:X6 + X2:X7 + X2:X8 + X2:X9 + X3:X4 + X3:X8 + 
##     X3:X9 + X4:X8 + X4:X9 + X5:X8 + X5:X9 + X6:X8 + X6:X9 + X7:X8 + 
##     X7:X9 + X8:X9
## 
## 
## Step:  AIC=-86.68
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X1:X2 + 
##     X1:X3 + X1:X4 + X1:X5 + X1:X6 + X1:X7 + X1:X8 + X1:X9 + X2:X3 + 
##     X2:X4 + X2:X5 + X2:X6 + X2:X7 + X2:X8 + X2:X9 + X3:X8 + X3:X9 + 
##     X4:X8 + X4:X9 + X5:X8 + X5:X9 + X6:X8 + X6:X9 + X7:X8 + X7:X9 + 
##     X8:X9
## 
## 
## Step:  AIC=-86.68
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X1:X2 + 
##     X1:X3 + X1:X4 + X1:X5 + X1:X6 + X1:X7 + X1:X8 + X1:X9 + X2:X3 + 
##     X2:X4 + X2:X5 + X2:X6 + X2:X8 + X2:X9 + X3:X8 + X3:X9 + X4:X8 + 
##     X4:X9 + X5:X8 + X5:X9 + X6:X8 + X6:X9 + X7:X8 + X7:X9 + X8:X9
## 
## 
## Step:  AIC=-86.68
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X1:X2 + 
##     X1:X3 + X1:X4 + X1:X5 + X1:X6 + X1:X7 + X1:X8 + X1:X9 + X2:X3 + 
##     X2:X4 + X2:X5 + X2:X8 + X2:X9 + X3:X8 + X3:X9 + X4:X8 + X4:X9 + 
##     X5:X8 + X5:X9 + X6:X8 + X6:X9 + X7:X8 + X7:X9 + X8:X9
## 
## 
## Step:  AIC=-86.68
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X1:X2 + 
##     X1:X3 + X1:X4 + X1:X5 + X1:X6 + X1:X7 + X1:X8 + X1:X9 + X2:X3 + 
##     X2:X4 + X2:X8 + X2:X9 + X3:X8 + X3:X9 + X4:X8 + X4:X9 + X5:X8 + 
##     X5:X9 + X6:X8 + X6:X9 + X7:X8 + X7:X9 + X8:X9
## 
## 
## Step:  AIC=-86.68
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X1:X2 + 
##     X1:X3 + X1:X4 + X1:X5 + X1:X6 + X1:X7 + X1:X8 + X1:X9 + X2:X3 + 
##     X2:X8 + X2:X9 + X3:X8 + X3:X9 + X4:X8 + X4:X9 + X5:X8 + X5:X9 + 
##     X6:X8 + X6:X9 + X7:X8 + X7:X9 + X8:X9
## 
## 
## Step:  AIC=-86.68
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X1:X2 + 
##     X1:X3 + X1:X4 + X1:X5 + X1:X6 + X1:X7 + X1:X8 + X1:X9 + X2:X8 + 
##     X2:X9 + X3:X8 + X3:X9 + X4:X8 + X4:X9 + X5:X8 + X5:X9 + X6:X8 + 
##     X6:X9 + X7:X8 + X7:X9 + X8:X9
## 
## 
## Step:  AIC=-86.68
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X1:X2 + 
##     X1:X3 + X1:X4 + X1:X5 + X1:X6 + X1:X8 + X1:X9 + X2:X8 + X2:X9 + 
##     X3:X8 + X3:X9 + X4:X8 + X4:X9 + X5:X8 + X5:X9 + X6:X8 + X6:X9 + 
##     X7:X8 + X7:X9 + X8:X9
## 
## 
## Step:  AIC=-86.68
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X1:X2 + 
##     X1:X3 + X1:X4 + X1:X5 + X1:X8 + X1:X9 + X2:X8 + X2:X9 + X3:X8 + 
##     X3:X9 + X4:X8 + X4:X9 + X5:X8 + X5:X9 + X6:X8 + X6:X9 + X7:X8 + 
##     X7:X9 + X8:X9
## 
## 
## Step:  AIC=-86.68
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X1:X2 + 
##     X1:X3 + X1:X4 + X1:X8 + X1:X9 + X2:X8 + X2:X9 + X3:X8 + X3:X9 + 
##     X4:X8 + X4:X9 + X5:X8 + X5:X9 + X6:X8 + X6:X9 + X7:X8 + X7:X9 + 
##     X8:X9
## 
## 
## Step:  AIC=-86.68
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X1:X2 + 
##     X1:X3 + X1:X8 + X1:X9 + X2:X8 + X2:X9 + X3:X8 + X3:X9 + X4:X8 + 
##     X4:X9 + X5:X8 + X5:X9 + X6:X8 + X6:X9 + X7:X8 + X7:X9 + X8:X9
## 
##         Df Sum of Sq     RSS     AIC
## - X1:X8  1   0.00125 0.27702 -88.546
## - X3:X9  1   0.00166 0.27743 -88.501
## - X1:X9  1   0.00171 0.27748 -88.496
## - X4:X9  1   0.00224 0.27801 -88.439
## - X5:X8  1   0.00375 0.27952 -88.276
## - X3:X8  1   0.01365 0.28942 -87.232
## - X5:X9  1   0.01394 0.28971 -87.202
## <none>               0.27577 -86.682
## - X4:X8  1   0.01926 0.29503 -86.656
## - X8:X9  1   0.02380 0.29957 -86.198
## - X1:X2  1   0.02492 0.30069 -86.086
## - X2:X8  1   0.02521 0.30098 -86.057
## - X6:X8  1   0.02975 0.30552 -85.608
## - X6:X9  1   0.03024 0.30601 -85.560
## - X2:X9  1   0.03404 0.30981 -85.190
## - X7:X8  1   0.04050 0.31627 -84.570
## - X7:X9  1   0.08581 0.36158 -80.554
## - X1:X3  1   1.65959 1.93536 -30.227
## 
## Step:  AIC=-88.55
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X1:X2 + 
##     X1:X3 + X1:X9 + X2:X8 + X2:X9 + X3:X8 + X3:X9 + X4:X8 + X4:X9 + 
##     X5:X8 + X5:X9 + X6:X8 + X6:X9 + X7:X8 + X7:X9 + X8:X9
## 
##         Df Sum of Sq     RSS     AIC
## - X4:X9  1   0.00103 0.27806 -90.434
## - X5:X8  1   0.00630 0.28332 -89.871
## - X3:X9  1   0.01467 0.29170 -88.997
## <none>               0.27702 -88.546
## - X5:X9  1   0.01990 0.29692 -88.465
## - X2:X8  1   0.02658 0.30360 -87.797
## - X1:X2  1   0.02706 0.30408 -87.750
## - X3:X8  1   0.02955 0.30658 -87.504
## - X6:X9  1   0.03164 0.30866 -87.301
## - X6:X8  1   0.03412 0.31114 -87.061
## - X4:X8  1   0.03459 0.31162 -87.015
## - X2:X9  1   0.03623 0.31325 -86.859
## - X8:X9  1   0.03768 0.31470 -86.720
## - X7:X8  1   0.04267 0.31969 -86.248
## - X1:X9  1   0.08036 0.35738 -82.904
## - X7:X9  1   0.08949 0.36652 -82.147
## - X1:X3  1   1.67908 1.95611 -31.907
## 
## Step:  AIC=-90.43
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X1:X2 + 
##     X1:X3 + X1:X9 + X2:X8 + X2:X9 + X3:X8 + X3:X9 + X4:X8 + X5:X8 + 
##     X5:X9 + X6:X8 + X6:X9 + X7:X8 + X7:X9 + X8:X9
## 
##         Df Sum of Sq     RSS     AIC
## - X5:X8  1   0.01021 0.28826 -91.352
## <none>               0.27806 -90.434
## - X3:X9  1   0.01962 0.29768 -90.388
## - X1:X2  1   0.02775 0.30580 -89.580
## - X3:X8  1   0.02855 0.30661 -89.502
## - X5:X9  1   0.02886 0.30692 -89.471
## - X6:X9  1   0.03330 0.31136 -89.040
## - X8:X9  1   0.03752 0.31557 -88.637
## - X6:X8  1   0.04082 0.31888 -88.324
## - X7:X8  1   0.05026 0.32832 -87.449
## - X2:X8  1   0.07559 0.35365 -85.220
## - X1:X9  1   0.08639 0.36445 -84.317
## - X2:X9  1   0.09477 0.37282 -83.635
## - X7:X9  1   0.09547 0.37353 -83.579
## - X4:X8  1   0.11185 0.38991 -82.291
## - X1:X3  1   1.76157 2.03963 -32.653
## 
## Step:  AIC=-91.35
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X1:X2 + 
##     X1:X3 + X1:X9 + X2:X8 + X2:X9 + X3:X8 + X3:X9 + X4:X8 + X5:X9 + 
##     X6:X8 + X6:X9 + X7:X8 + X7:X9 + X8:X9
## 
##         Df Sum of Sq     RSS     AIC
## - X3:X9  1   0.01387 0.30213 -91.943
## <none>               0.28826 -91.352
## - X1:X2  1   0.02606 0.31432 -90.756
## - X6:X9  1   0.02892 0.31718 -90.484
## - X5:X9  1   0.03517 0.32343 -89.899
## - X6:X8  1   0.03526 0.32352 -89.891
## - X8:X9  1   0.03647 0.32474 -89.778
## - X7:X8  1   0.05418 0.34244 -88.186
## - X3:X8  1   0.06678 0.35505 -87.101
## - X1:X9  1   0.08233 0.37059 -85.815
## - X2:X8  1   0.09026 0.37852 -85.180
## - X4:X8  1   0.11594 0.40420 -83.211
## - X2:X9  1   0.12196 0.41023 -82.767
## - X7:X9  1   0.19579 0.48405 -77.803
## - X1:X3  1   1.79585 2.08412 -34.006
## 
## Step:  AIC=-91.94
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X1:X2 + 
##     X1:X3 + X1:X9 + X2:X8 + X2:X9 + X3:X8 + X4:X8 + X5:X9 + X6:X8 + 
##     X6:X9 + X7:X8 + X7:X9 + X8:X9
## 
##         Df Sum of Sq     RSS     AIC
## - X6:X9  1   0.01568 0.31781 -92.425
## <none>               0.30213 -91.943
## - X6:X8  1   0.02200 0.32413 -91.834
## - X5:X9  1   0.02250 0.32464 -91.788
## - X1:X2  1   0.03453 0.33666 -90.696
## - X7:X8  1   0.04139 0.34352 -90.092
## - X8:X9  1   0.05081 0.35295 -89.279
## - X1:X9  1   0.07027 0.37241 -87.669
## - X2:X8  1   0.07640 0.37853 -87.180
## - X3:X8  1   0.09504 0.39717 -85.737
## - X4:X8  1   0.10557 0.40770 -84.952
## - X2:X9  1   0.10898 0.41111 -84.703
## - X7:X9  1   0.20828 0.51041 -78.212
## - X1:X3  1   1.80109 2.10322 -35.732
## 
## Step:  AIC=-92.43
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X1:X2 + 
##     X1:X3 + X1:X9 + X2:X8 + X2:X9 + X3:X8 + X4:X8 + X5:X9 + X6:X8 + 
##     X7:X8 + X7:X9 + X8:X9
## 
##         Df Sum of Sq     RSS     AIC
## - X6:X8  1   0.00642 0.32423 -93.825
## - X1:X2  1   0.01993 0.33773 -92.601
## <none>               0.31781 -92.425
## - X5:X9  1   0.02449 0.34230 -92.198
## - X7:X8  1   0.02573 0.34354 -92.090
## - X8:X9  1   0.03582 0.35363 -91.221
## - X1:X9  1   0.06065 0.37846 -89.186
## - X2:X8  1   0.06122 0.37903 -89.140
## - X3:X8  1   0.08481 0.40262 -87.329
## - X4:X8  1   0.09252 0.41033 -86.760
## - X2:X9  1   0.09419 0.41200 -86.638
## - X7:X9  1   0.23406 0.55187 -77.869
## - X1:X3  1   1.89418 2.21199 -36.219
## 
## Step:  AIC=-93.83
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X1:X2 + 
##     X1:X3 + X1:X9 + X2:X8 + X2:X9 + X3:X8 + X4:X8 + X5:X9 + X7:X8 + 
##     X7:X9 + X8:X9
## 
##         Df Sum of Sq     RSS     AIC
## - X7:X8  1   0.02050 0.34473 -93.986
## - X1:X2  1   0.02195 0.34617 -93.860
## <none>               0.32423 -93.825
## - X6     1   0.02936 0.35359 -93.225
## - X5:X9  1   0.03666 0.36089 -92.611
## - X8:X9  1   0.05990 0.38412 -90.740
## - X2:X8  1   0.10202 0.42624 -87.618
## - X2:X9  1   0.16870 0.49293 -83.258
## - X7:X9  1   0.23395 0.55817 -79.529
## - X1:X9  1   0.25322 0.57745 -78.510
## - X3:X8  1   0.26381 0.58803 -77.965
## - X4:X8  1   0.40165 0.72587 -71.647
## - X1:X3  1   1.90127 2.22550 -38.036
## 
## Step:  AIC=-93.99
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X1:X2 + 
##     X1:X3 + X1:X9 + X2:X8 + X2:X9 + X3:X8 + X4:X8 + X5:X9 + X7:X9 + 
##     X8:X9
## 
##         Df Sum of Sq     RSS     AIC
## - X1:X2  1   0.01744 0.36216 -94.506
## <none>               0.34473 -93.986
## - X6     1   0.02380 0.36853 -93.983
## - X8:X9  1   0.04553 0.39026 -92.264
## - X5:X9  1   0.04913 0.39386 -91.988
## - X2:X8  1   0.08418 0.42891 -89.432
## - X2:X9  1   0.15080 0.49553 -85.100
## - X1:X9  1   0.27457 0.61930 -78.411
## - X3:X8  1   0.30355 0.64828 -77.039
## - X7:X9  1   0.32337 0.66809 -76.136
## - X4:X8  1   0.42623 0.77096 -71.840
## - X1:X3  1   1.88785 2.23258 -39.941
## 
## Step:  AIC=-94.51
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X1:X3 + 
##     X1:X9 + X2:X8 + X2:X9 + X3:X8 + X4:X8 + X5:X9 + X7:X9 + X8:X9
## 
##         Df Sum of Sq     RSS     AIC
## - X6     1   0.02085 0.38302 -94.826
## <none>               0.36216 -94.506
## - X8:X9  1   0.03703 0.39920 -93.585
## - X5:X9  1   0.03720 0.39936 -93.573
## - X2:X8  1   0.06882 0.43098 -91.286
## - X2:X9  1   0.13369 0.49585 -87.080
## - X1:X9  1   0.26000 0.62217 -80.272
## - X3:X8  1   0.29045 0.65261 -78.839
## - X7:X9  1   0.30630 0.66847 -78.119
## - X4:X8  1   0.40893 0.77109 -73.834
## - X1:X3  1   1.87267 2.23483 -41.911
## 
## Step:  AIC=-94.83
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X7 + X8 + X9 + X1:X3 + X1:X9 + 
##     X2:X8 + X2:X9 + X3:X8 + X4:X8 + X5:X9 + X7:X9 + X8:X9
## 
##         Df Sum of Sq     RSS     AIC
## - X5:X9  1   0.02376 0.40678 -95.021
## - X8:X9  1   0.02579 0.40880 -94.871
## <none>               0.38302 -94.826
## - X2:X8  1   0.04986 0.43287 -93.155
## - X2:X9  1   0.11290 0.49591 -89.077
## - X1:X9  1   0.23967 0.62268 -82.247
## - X3:X8  1   0.28141 0.66443 -80.301
## - X7:X9  1   0.28680 0.66981 -80.059
## - X4:X8  1   0.39670 0.77972 -75.501
## - X1:X3  1   2.28557 2.66859 -38.589
## 
## Step:  AIC=-95.02
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X7 + X8 + X9 + X1:X3 + X1:X9 + 
##     X2:X8 + X2:X9 + X3:X8 + X4:X8 + X7:X9 + X8:X9
## 
##         Df Sum of Sq     RSS     AIC
## - X8:X9  1   0.02190 0.42867 -95.448
## <none>               0.40678 -95.021
## - X2:X8  1   0.02957 0.43635 -94.916
## - X2:X9  1   0.08976 0.49654 -91.039
## - X5     1   0.18602 0.59280 -85.723
## - X7:X9  1   0.26382 0.67060 -82.024
## - X1:X9  1   0.32901 0.73579 -79.240
## - X3:X8  1   0.35450 0.76128 -78.219
## - X4:X8  1   0.43472 0.84149 -75.213
## - X1:X3  1   2.29575 2.70253 -40.210
## 
## Step:  AIC=-95.45
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X7 + X8 + X9 + X1:X3 + X1:X9 + 
##     X2:X8 + X2:X9 + X3:X8 + X4:X8 + X7:X9
## 
##         Df Sum of Sq     RSS     AIC
## - X2:X8  1   0.01037 0.43904 -96.731
## <none>               0.42867 -95.448
## - X2:X9  1   0.06896 0.49763 -92.973
## - X5     1   0.17518 0.60385 -87.169
## - X7:X9  1   0.24199 0.67066 -84.021
## - X1:X9  1   0.30722 0.73589 -81.236
## - X3:X8  1   0.34794 0.77661 -79.620
## - X4:X8  1   0.47278 0.90146 -75.148
## - X1:X3  1   2.41556 2.84423 -40.677
## 
## Step:  AIC=-96.73
## log(y) ~ X1 + X2 + X3 + X4 + X5 + X7 + X8 + X9 + X1:X3 + X1:X9 + 
##     X2:X9 + X3:X8 + X4:X8 + X7:X9
## 
##         Df Sum of Sq     RSS     AIC
## <none>               0.43904 -96.731
## - X5     1   0.17280 0.61185 -88.774
## - X7:X9  1   0.24527 0.68431 -85.416
## - X2:X9  1   0.30033 0.73937 -83.095
## - X1:X9  1   0.31184 0.75088 -82.631
## - X3:X8  1   0.40524 0.84428 -79.114
## - X4:X8  1   0.66804 1.10708 -70.984
## - X1:X3  1   2.49992 2.93897 -41.694
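The run of "Step: AIC=-86.68" blocks without a drop table at the start of this trace has a likely explanation in the size of the model. The `(...)^2` formula asks for every main effect and every pairwise interaction, which is more coefficients than observations if n = 30 (an assumption taken from the total df of 29 in the ANOVA table earlier). The step procedure then appears to discard the inestimable zero-df terms one at a time before any real comparison starts. A quick count, using only the formula:

```r
# Rebuild the full interaction formula and count its terms.
vars <- paste0("X", 1:9)
f <- as.formula(paste("log(y) ~ (", paste(vars, collapse = " + "), ")^2"))
labs <- attr(terms(f), "term.labels")

n_obs  <- 30                  # assumption: total df 29 in the ANOVA table
n_coef <- length(labs) + 1    # 9 main effects + 36 interactions + intercept

print(c(terms = length(labs), coefficients = n_coef, observations = n_obs))
```

With so few residual degrees of freedom left after the aliased terms are dropped, the selected interaction model should be read with caution.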
# Interaction regressions with each of X1-X9 removed in turn
model_wf_rm1_log_inter <- lm(log(y) ~ (X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9)^2, data=table_wf)
model_wf_rm1_aic_log_inter <- stepAIC(model_wf_rm1_log_inter)
model_wf_rm2_log_inter <- lm(log(y) ~ (X1 + X3 + X4 + X5 + X6 + X7 + X8 + X9)^2, data=table_wf)
model_wf_rm2_aic_log_inter <- stepAIC(model_wf_rm2_log_inter)
model_wf_rm3_log_inter <- lm(log(y) ~ (X1 + X2 + X4 + X5 + X6 + X7+ X8 + X9)^2, data=table_wf)
model_wf_rm3_aic_log_inter <- stepAIC(model_wf_rm3_log_inter)
model_wf_rm5_log_inter <- lm(log(y) ~ (X1 + X2 + X3 + X4 + X6 + X7 + X8 + X9)^2, data=table_wf)
model_wf_rm5_aic_log_inter <- stepAIC(model_wf_rm5_log_inter)
model_wf_rm4_log_inter <- lm(log(y) ~ (X1 + X2 + X3 + X5 + X6 + X7 + X8 + X9)^2, data=table_wf)
model_wf_rm4_aic_log_inter <- stepAIC(model_wf_rm4_log_inter)
model_wf_rm6_log_inter <- lm(log(y) ~ (X2 + X3 + X1 + X5 + X4 + X7 + X8 + X9)^2, data=table_wf)
model_wf_rm6_aic_log_inter <- stepAIC(model_wf_rm6_log_inter)
model_wf_rm7_log_inter <- lm(log(y) ~ (X1 + X2 + X3 + X4 + X5 + X6 + X8 + X9)^2, data=table_wf)
model_wf_rm7_aic_log_inter <- stepAIC(model_wf_rm7_log_inter)
model_wf_rm8_log_inter <- lm(log(y) ~ (X1 + X2 + X3 + X4 + X5 + X6 + X7 + X9)^2, data=table_wf)
model_wf_rm8_aic_log_inter <- stepAIC(model_wf_rm8_log_inter)
model_wf_rm9_log_inter <- lm(log(y) ~ (X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8)^2, data=table_wf)
model_wf_rm9_aic_log_inter <- stepAIC(model_wf_rm9_log_inter)

# Interaction regression for 136789
model_wf_136789_log_inter <- lm(log(y) ~ (X3 + X1 + X6 + X7 + X8 + X9)^2, data=table_wf)
model_wf_136789_aic_log_inter <- stepAIC(model_wf_136789_log_inter)
# Interaction regression for 436789
model_wf_436789_log_inter <- lm(log(y) ~ (X3 + X4 + X6 + X7 + X8 + X9)^2, data=table_wf)
model_wf_436789_aic_log_inter <- stepAIC(model_wf_436789_log_inter)
# Interaction regression for 437
model_wf_437_log_inter <- lm(log(y) ~ (X3 + X4 + X7 )^2, data=table_wf)
model_wf_437_aic_log_inter <- stepAIC(model_wf_437_log_inter)

# Interaction regression for 489
model_wf_489_log_inter <- lm(log(y) ~ (X4 + X8 + X9 )^2, data=table_wf)
model_wf_489_aic_log_inter <- stepAIC(model_wf_489_log_inter)

# Interaction regression by groups
model_wf_3g_log_inter <- lm(log(y) ~ (log(X4)  + log(X6))^2 + log(X8) + log(X9), data=table_wf)
model_wf_3g_aic_log_inter <- stepAIC(model_wf_3g_log_inter)

# Interaction regression by groups1
model_wf_3g1_log_inter <- lm(log(y) ~ (log(X3) +log(X7))^2 + log(X8) + log(X9), data=table_wf)
model_wf_3g1_aic_log_inter <- stepAIC(model_wf_3g1_log_inter)

# Comparison
huxreg(model_wf_rm1_aic_log_inter, model_wf_rm2_aic_log_inter, model_wf_rm3_aic_log_inter, model_wf_rm4_aic_log_inter, model_wf_rm5_aic_log_inter, model_wf_rm6_aic_log_inter, model_wf_rm7_aic_log_inter, model_wf_rm8_aic_log_inter, model_wf_rm9_aic_log_inter, model_wf_aic_log_inter)

huxreg(model_wf_136789_aic_log_inter,model_wf_436789_aic_log_inter, model_wf_437_log_inter, model_wf_489_log_inter, model_wf_3g_aic_log_inter, model_wf_3g1_aic_log_inter)
  • All log models
# build all log model
model_wf_all_log <- lm(log(y) ~ log(X1) + log(X2) + log(X3) + log(X4) + log(X5) + log(X6) + log(X7) + log(X8) + log(X9), data=table_wf)
ols_vif_tol(model_wf_all_log)
Variables Tolerance VIF
log(X1) 0.00608 164   
log(X2) 0.0865  11.6 
log(X3) 0.0816  12.3 
log(X4) 0.00767 130   
log(X5) 0.0885  11.3 
log(X6) 0.421   2.37
log(X7) 0.108   9.29
log(X8) 0.193   5.18
log(X9) 0.187   5.35
model_wf_aic_all_log <- stepAIC(model_wf_all_log)
## Start:  AIC=-84.46
## log(y) ~ log(X1) + log(X2) + log(X3) + log(X4) + log(X5) + log(X6) + 
##     log(X7) + log(X8) + log(X9)
## 
##           Df Sum of Sq    RSS     AIC
## - log(X7)  1    0.0027 0.9251 -86.371
## - log(X2)  1    0.0310 0.9534 -85.468
## - log(X4)  1    0.0370 0.9595 -85.277
## - log(X3)  1    0.0525 0.9750 -84.797
## - log(X5)  1    0.0574 0.9798 -84.647
## <none>                 0.9224 -84.459
## - log(X6)  1    0.2329 1.1553 -79.705
## - log(X1)  1    0.2712 1.1936 -78.727
## - log(X9)  1    3.4818 4.4043 -39.559
## - log(X8)  1    3.6487 4.5711 -38.443
## 
## Step:  AIC=-86.37
## log(y) ~ log(X1) + log(X2) + log(X3) + log(X4) + log(X5) + log(X6) + 
##     log(X8) + log(X9)
## 
##           Df Sum of Sq    RSS     AIC
## - log(X4)  1    0.0370 0.9621 -87.193
## - log(X2)  1    0.0559 0.9810 -86.611
## <none>                 0.9251 -86.371
## - log(X3)  1    0.0777 1.0028 -85.953
## - log(X5)  1    0.0983 1.0234 -85.341
## - log(X6)  1    0.3174 1.2425 -79.522
## - log(X1)  1    0.3899 1.3150 -77.820
## - log(X9)  1    3.4793 4.4044 -41.558
## - log(X8)  1    3.6745 4.5996 -40.257
## 
## Step:  AIC=-87.19
## log(y) ~ log(X1) + log(X2) + log(X3) + log(X5) + log(X6) + log(X8) + 
##     log(X9)
## 
##           Df Sum of Sq    RSS     AIC
## - log(X2)  1    0.0369 0.9990 -88.065
## <none>                 0.9621 -87.193
## - log(X5)  1    0.1419 1.1040 -85.067
## - log(X6)  1    0.2985 1.2607 -81.087
## - log(X3)  1    0.5087 1.4709 -76.460
## - log(X9)  1    3.4515 4.4137 -43.495
## - log(X8)  1    3.6420 4.6042 -42.227
## - log(X1)  1    3.8134 4.7755 -41.131
## 
## Step:  AIC=-88.07
## log(y) ~ log(X1) + log(X3) + log(X5) + log(X6) + log(X8) + log(X9)
## 
##           Df Sum of Sq     RSS     AIC
## <none>                  0.9990 -88.065
## - log(X5)  1    0.1087  1.1077 -86.967
## - log(X6)  1    0.3805  1.3795 -80.384
## - log(X3)  1    0.8252  1.8242 -72.001
## - log(X9)  1    3.4549  4.4539 -45.222
## - log(X8)  1    3.7305  4.7295 -43.421
## - log(X1)  1   17.5601 18.5592  -2.407
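The AIC values in this trace can be reproduced by hand. For `lm` fits, `stepAIC` ranks models with \(n\ln(RSS/n)+2p\), where \(p\) counts the estimated coefficients including the intercept. Assuming n = 30 (total df of 29 in the ANOVA table earlier), the starting model has \(p=10\) and \(RSS=0.9224\):

\[AIC=n\ln\left(\frac{RSS}{n}\right)+2p=30\ln\left(\frac{0.9224}{30}\right)+2\times10\approx-104.459+20=-84.459\]

This matches the "Start: AIC=-84.46" line. Because the constant terms of the log-likelihood are omitted, these values are only comparable between models fitted to the same response on the same data.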
ols_vif_tol(model_wf_aic_all_log)
Variables Tolerance VIF
log(X1) 0.263 3.8 
log(X3) 0.603 1.66
log(X5) 0.22  4.55
log(X6) 0.71  1.41
log(X8) 0.201 4.99
log(X9) 0.191 5.22
# Interaction regression for all log  model
model_wf_all_log_inter <- lm(log(y) ~ (log(X1) + log(X2) + log(X3) + log(X4) + log(X5) + log(X6) + log(X7) + log(X8) + log(X9))^2, data=table_wf)
model_wf_aic_all_log_inter <- stepAIC(model_wf_all_log_inter)
## Start:  AIC=-117.75
## log(y) ~ (log(X1) + log(X2) + log(X3) + log(X4) + log(X5) + log(X6) + 
##     log(X7) + log(X8) + log(X9))^2
## 
## 
## Step:  AIC=-117.75
## log(y) ~ log(X1) + log(X2) + log(X3) + log(X4) + log(X5) + log(X6) + 
##     log(X7) + log(X8) + log(X9) + log(X1):log(X2) + log(X1):log(X3) + 
##     log(X1):log(X4) + log(X1):log(X5) + log(X1):log(X6) + log(X1):log(X7) + 
##     log(X1):log(X8) + log(X1):log(X9) + log(X2):log(X3) + log(X2):log(X4) + 
##     log(X2):log(X5) + log(X2):log(X6) + log(X2):log(X7) + log(X2):log(X8) + 
##     log(X2):log(X9) + log(X3):log(X4) + log(X3):log(X5) + log(X3):log(X6) + 
##     log(X3):log(X7) + log(X3):log(X8) + log(X3):log(X9) + log(X4):log(X5) + 
##     log(X4):log(X6) + log(X4):log(X7) + log(X4):log(X8) + log(X4):log(X9) + 
##     log(X5):log(X6) + log(X5):log(X7) + log(X5):log(X8) + log(X5):log(X9) + 
##     log(X6):log(X8) + log(X6):log(X9) + log(X7):log(X8) + log(X7):log(X9) + 
##     log(X8):log(X9)
## 
## 
## Step:  AIC=-117.75
## log(y) ~ log(X1) + log(X2) + log(X3) + log(X4) + log(X5) + log(X6) + 
##     log(X7) + log(X8) + log(X9) + log(X1):log(X2) + log(X1):log(X3) + 
##     log(X1):log(X4) + log(X1):log(X5) + log(X1):log(X6) + log(X1):log(X7) + 
##     log(X1):log(X8) + log(X1):log(X9) + log(X2):log(X3) + log(X2):log(X4) + 
##     log(X2):log(X5) + log(X2):log(X6) + log(X2):log(X7) + log(X2):log(X8) + 
##     log(X2):log(X9) + log(X3):log(X4) + log(X3):log(X5) + log(X3):log(X6) + 
##     log(X3):log(X7) + log(X3):log(X8) + log(X3):log(X9) + log(X4):log(X5) + 
##     log(X4):log(X6) + log(X4):log(X7) + log(X4):log(X8) + log(X4):log(X9) + 
##     log(X5):log(X6) + log(X5):log(X8) + log(X5):log(X9) + log(X6):log(X8) + 
##     log(X6):log(X9) + log(X7):log(X8) + log(X7):log(X9) + log(X8):log(X9)
## 
## 
## Step:  AIC=-117.75
## log(y) ~ log(X1) + log(X2) + log(X3) + log(X4) + log(X5) + log(X6) + 
##     log(X7) + log(X8) + log(X9) + log(X1):log(X2) + log(X1):log(X3) + 
##     log(X1):log(X4) + log(X1):log(X5) + log(X1):log(X6) + log(X1):log(X7) + 
##     log(X1):log(X8) + log(X1):log(X9) + log(X2):log(X3) + log(X2):log(X4) + 
##     log(X2):log(X5) + log(X2):log(X6) + log(X2):log(X7) + log(X2):log(X8) + 
##     log(X2):log(X9) + log(X3):log(X4) + log(X3):log(X5) + log(X3):log(X6) + 
##     log(X3):log(X7) + log(X3):log(X8) + log(X3):log(X9) + log(X4):log(X5) + 
##     log(X4):log(X6) + log(X4):log(X7) + log(X4):log(X8) + log(X4):log(X9) + 
##     log(X5):log(X8) + log(X5):log(X9) + log(X6):log(X8) + log(X6):log(X9) + 
##     log(X7):log(X8) + log(X7):log(X9) + log(X8):log(X9)
## 
## 
## Step:  AIC=-117.75
## log(y) ~ log(X1) + log(X2) + log(X3) + log(X4) + log(X5) + log(X6) + 
##     log(X7) + log(X8) + log(X9) + log(X1):log(X2) + log(X1):log(X3) + 
##     log(X1):log(X4) + log(X1):log(X5) + log(X1):log(X6) + log(X1):log(X7) + 
##     log(X1):log(X8) + log(X1):log(X9) + log(X2):log(X3) + log(X2):log(X4) + 
##     log(X2):log(X5) + log(X2):log(X6) + log(X2):log(X7) + log(X2):log(X8) + 
##     log(X2):log(X9) + log(X3):log(X4) + log(X3):log(X5) + log(X3):log(X6) + 
##     log(X3):log(X7) + log(X3):log(X8) + log(X3):log(X9) + log(X4):log(X5) + 
##     log(X4):log(X6) + log(X4):log(X8) + log(X4):log(X9) + log(X5):log(X8) + 
##     log(X5):log(X9) + log(X6):log(X8) + log(X6):log(X9) + log(X7):log(X8) + 
##     log(X7):log(X9) + log(X8):log(X9)
## 
## 
## Step:  AIC=-117.75
## log(y) ~ log(X1) + log(X2) + log(X3) + log(X4) + log(X5) + log(X6) + 
##     log(X7) + log(X8) + log(X9) + log(X1):log(X2) + log(X1):log(X3) + 
##     log(X1):log(X4) + log(X1):log(X5) + log(X1):log(X6) + log(X1):log(X7) + 
##     log(X1):log(X8) + log(X1):log(X9) + log(X2):log(X3) + log(X2):log(X4) + 
##     log(X2):log(X5) + log(X2):log(X6) + log(X2):log(X7) + log(X2):log(X8) + 
##     log(X2):log(X9) + log(X3):log(X4) + log(X3):log(X5) + log(X3):log(X6) + 
##     log(X3):log(X7) + log(X3):log(X8) + log(X3):log(X9) + log(X4):log(X5) + 
##     log(X4):log(X8) + log(X4):log(X9) + log(X5):log(X8) + log(X5):log(X9) + 
##     log(X6):log(X8) + log(X6):log(X9) + log(X7):log(X8) + log(X7):log(X9) + 
##     log(X8):log(X9)
## 
## 
## Step:  AIC=-117.75
## log(y) ~ log(X1) + log(X2) + log(X3) + log(X4) + log(X5) + log(X6) + 
##     log(X7) + log(X8) + log(X9) + log(X1):log(X2) + log(X1):log(X3) + 
##     log(X1):log(X4) + log(X1):log(X5) + log(X1):log(X6) + log(X1):log(X7) + 
##     log(X1):log(X8) + log(X1):log(X9) + log(X2):log(X3) + log(X2):log(X4) + 
##     log(X2):log(X5) + log(X2):log(X6) + log(X2):log(X7) + log(X2):log(X8) + 
##     log(X2):log(X9) + log(X3):log(X4) + log(X3):log(X5) + log(X3):log(X6) + 
##     log(X3):log(X7) + log(X3):log(X8) + log(X3):log(X9) + log(X4):log(X8) + 
##     log(X4):log(X9) + log(X5):log(X8) + log(X5):log(X9) + log(X6):log(X8) + 
##     log(X6):log(X9) + log(X7):log(X8) + log(X7):log(X9) + log(X8):log(X9)
## 
## 
## Step:  AIC=-117.75
## log(y) ~ log(X1) + log(X2) + log(X3) + log(X4) + log(X5) + log(X6) + 
##     log(X7) + log(X8) + log(X9) + log(X1):log(X2) + log(X1):log(X3) + 
##     log(X1):log(X4) + log(X1):log(X5) + log(X1):log(X6) + log(X1):log(X7) + 
##     log(X1):log(X8) + log(X1):log(X9) + log(X2):log(X3) + log(X2):log(X4) + 
##     log(X2):log(X5) + log(X2):log(X6) + log(X2):log(X7) + log(X2):log(X8) + 
##     log(X2):log(X9) + log(X3):log(X4) + log(X3):log(X5) + log(X3):log(X6) + 
##     log(X3):log(X8) + log(X3):log(X9) + log(X4):log(X8) + log(X4):log(X9) + 
##     log(X5):log(X8) + log(X5):log(X9) + log(X6):log(X8) + log(X6):log(X9) + 
##     log(X7):log(X8) + log(X7):log(X9) + log(X8):log(X9)
## 
## 
## Step:  AIC=-117.75
## log(y) ~ log(X1) + log(X2) + log(X3) + log(X4) + log(X5) + log(X6) + 
##     log(X7) + log(X8) + log(X9) + log(X1):log(X2) + log(X1):log(X3) + 
##     log(X1):log(X4) + log(X1):log(X5) + log(X1):log(X6) + log(X1):log(X7) + 
##     log(X1):log(X8) + log(X1):log(X9) + log(X2):log(X3) + log(X2):log(X4) + 
##     log(X2):log(X5) + log(X2):log(X6) + log(X2):log(X7) + log(X2):log(X8) + 
##     log(X2):log(X9) + log(X3):log(X4) + log(X3):log(X5) + log(X3):log(X8) + 
##     log(X3):log(X9) + log(X4):log(X8) + log(X4):log(X9) + log(X5):log(X8) + 
##     log(X5):log(X9) + log(X6):log(X8) + log(X6):log(X9) + log(X7):log(X8) + 
##     log(X7):log(X9) + log(X8):log(X9)
## 
## 
## Step:  AIC=-117.75
## log(y) ~ log(X1) + log(X2) + log(X3) + log(X4) + log(X5) + log(X6) + 
##     log(X7) + log(X8) + log(X9) + log(X1):log(X2) + log(X1):log(X3) + 
##     log(X1):log(X4) + log(X1):log(X5) + log(X1):log(X6) + log(X1):log(X7) + 
##     log(X1):log(X8) + log(X1):log(X9) + log(X2):log(X3) + log(X2):log(X4) + 
##     log(X2):log(X5) + log(X2):log(X6) + log(X2):log(X7) + log(X2):log(X8) + 
##     log(X2):log(X9) + log(X3):log(X4) + log(X3):log(X8) + log(X3):log(X9) + 
##     log(X4):log(X8) + log(X4):log(X9) + log(X5):log(X8) + log(X5):log(X9) + 
##     log(X6):log(X8) + log(X6):log(X9) + log(X7):log(X8) + log(X7):log(X9) + 
##     log(X8):log(X9)
## 
## 
## Step:  AIC=-117.75
## log(y) ~ log(X1) + log(X2) + log(X3) + log(X4) + log(X5) + log(X6) + 
##     log(X7) + log(X8) + log(X9) + log(X1):log(X2) + log(X1):log(X3) + 
##     log(X1):log(X4) + log(X1):log(X5) + log(X1):log(X6) + log(X1):log(X7) + 
##     log(X1):log(X8) + log(X1):log(X9) + log(X2):log(X3) + log(X2):log(X4) + 
##     log(X2):log(X5) + log(X2):log(X6) + log(X2):log(X7) + log(X2):log(X8) + 
##     log(X2):log(X9) + log(X3):log(X8) + log(X3):log(X9) + log(X4):log(X8) + 
##     log(X4):log(X9) + log(X5):log(X8) + log(X5):log(X9) + log(X6):log(X8) + 
##     log(X6):log(X9) + log(X7):log(X8) + log(X7):log(X9) + log(X8):log(X9)
## 
## 
## Step:  AIC=-117.75
## log(y) ~ log(X1) + log(X2) + log(X3) + log(X4) + log(X5) + log(X6) + 
##     log(X7) + log(X8) + log(X9) + log(X1):log(X2) + log(X1):log(X3) + 
##     log(X1):log(X4) + log(X1):log(X5) + log(X1):log(X6) + log(X1):log(X7) + 
##     log(X1):log(X8) + log(X1):log(X9) + log(X2):log(X3) + log(X2):log(X4) + 
##     log(X2):log(X5) + log(X2):log(X6) + log(X2):log(X8) + log(X2):log(X9) + 
##     log(X3):log(X8) + log(X3):log(X9) + log(X4):log(X8) + log(X4):log(X9) + 
##     log(X5):log(X8) + log(X5):log(X9) + log(X6):log(X8) + log(X6):log(X9) + 
##     log(X7):log(X8) + log(X7):log(X9) + log(X8):log(X9)
## 
## 
## Step:  AIC=-117.75
## log(y) ~ log(X1) + log(X2) + log(X3) + log(X4) + log(X5) + log(X6) + 
##     log(X7) + log(X8) + log(X9) + log(X1):log(X2) + log(X1):log(X3) + 
##     log(X1):log(X4) + log(X1):log(X5) + log(X1):log(X6) + log(X1):log(X7) + 
##     log(X1):log(X8) + log(X1):log(X9) + log(X2):log(X3) + log(X2):log(X4) + 
##     log(X2):log(X5) + log(X2):log(X8) + log(X2):log(X9) + log(X3):log(X8) + 
##     log(X3):log(X9) + log(X4):log(X8) + log(X4):log(X9) + log(X5):log(X8) + 
##     log(X5):log(X9) + log(X6):log(X8) + log(X6):log(X9) + log(X7):log(X8) + 
##     log(X7):log(X9) + log(X8):log(X9)
## 
## 
## Step:  AIC=-117.75
## log(y) ~ log(X1) + log(X2) + log(X3) + log(X4) + log(X5) + log(X6) + 
##     log(X7) + log(X8) + log(X9) + log(X1):log(X2) + log(X1):log(X3) + 
##     log(X1):log(X4) + log(X1):log(X5) + log(X1):log(X6) + log(X1):log(X7) + 
##     log(X1):log(X8) + log(X1):log(X9) + log(X2):log(X3) + log(X2):log(X4) + 
##     log(X2):log(X8) + log(X2):log(X9) + log(X3):log(X8) + log(X3):log(X9) + 
##     log(X4):log(X8) + log(X4):log(X9) + log(X5):log(X8) + log(X5):log(X9) + 
##     log(X6):log(X8) + log(X6):log(X9) + log(X7):log(X8) + log(X7):log(X9) + 
##     log(X8):log(X9)
## 
## 
## Step:  AIC=-117.75
## log(y) ~ log(X1) + log(X2) + log(X3) + log(X4) + log(X5) + log(X6) + 
##     log(X7) + log(X8) + log(X9) + log(X1):log(X2) + log(X1):log(X3) + 
##     log(X1):log(X4) + log(X1):log(X5) + log(X1):log(X6) + log(X1):log(X7) + 
##     log(X1):log(X8) + log(X1):log(X9) + log(X2):log(X3) + log(X2):log(X8) + 
##     log(X2):log(X9) + log(X3):log(X8) + log(X3):log(X9) + log(X4):log(X8) + 
##     log(X4):log(X9) + log(X5):log(X8) + log(X5):log(X9) + log(X6):log(X8) + 
##     log(X6):log(X9) + log(X7):log(X8) + log(X7):log(X9) + log(X8):log(X9)
## 
## 
## Step:  AIC=-117.75
## log(y) ~ log(X1) + log(X2) + log(X3) + log(X4) + log(X5) + log(X6) + 
##     log(X7) + log(X8) + log(X9) + log(X1):log(X2) + log(X1):log(X3) + 
##     log(X1):log(X4) + log(X1):log(X5) + log(X1):log(X6) + log(X1):log(X7) + 
##     log(X1):log(X8) + log(X1):log(X9) + log(X2):log(X8) + log(X2):log(X9) + 
##     log(X3):log(X8) + log(X3):log(X9) + log(X4):log(X8) + log(X4):log(X9) + 
##     log(X5):log(X8) + log(X5):log(X9) + log(X6):log(X8) + log(X6):log(X9) + 
##     log(X7):log(X8) + log(X7):log(X9) + log(X8):log(X9)
## 
## 
## Step:  AIC=-117.75
## log(y) ~ log(X1) + log(X2) + log(X3) + log(X4) + log(X5) + log(X6) + 
##     log(X7) + log(X8) + log(X9) + log(X1):log(X2) + log(X1):log(X3) + 
##     log(X1):log(X4) + log(X1):log(X5) + log(X1):log(X6) + log(X1):log(X8) + 
##     log(X1):log(X9) + log(X2):log(X8) + log(X2):log(X9) + log(X3):log(X8) + 
##     log(X3):log(X9) + log(X4):log(X8) + log(X4):log(X9) + log(X5):log(X8) + 
##     log(X5):log(X9) + log(X6):log(X8) + log(X6):log(X9) + log(X7):log(X8) + 
##     log(X7):log(X9) + log(X8):log(X9)
## 
## 
## Step:  AIC=-117.75
## log(y) ~ log(X1) + log(X2) + log(X3) + log(X4) + log(X5) + log(X6) + 
##     log(X7) + log(X8) + log(X9) + log(X1):log(X2) + log(X1):log(X3) + 
##     log(X1):log(X4) + log(X1):log(X5) + log(X1):log(X8) + log(X1):log(X9) + 
##     log(X2):log(X8) + log(X2):log(X9) + log(X3):log(X8) + log(X3):log(X9) + 
##     log(X4):log(X8) + log(X4):log(X9) + log(X5):log(X8) + log(X5):log(X9) + 
##     log(X6):log(X8) + log(X6):log(X9) + log(X7):log(X8) + log(X7):log(X9) + 
##     log(X8):log(X9)
## 
## 
## Step:  AIC=-117.75
## log(y) ~ log(X1) + log(X2) + log(X3) + log(X4) + log(X5) + log(X6) + 
##     log(X7) + log(X8) + log(X9) + log(X1):log(X2) + log(X1):log(X3) + 
##     log(X1):log(X4) + log(X1):log(X8) + log(X1):log(X9) + log(X2):log(X8) + 
##     log(X2):log(X9) + log(X3):log(X8) + log(X3):log(X9) + log(X4):log(X8) + 
##     log(X4):log(X9) + log(X5):log(X8) + log(X5):log(X9) + log(X6):log(X8) + 
##     log(X6):log(X9) + log(X7):log(X8) + log(X7):log(X9) + log(X8):log(X9)
## 
## 
## Step:  AIC=-117.75
## log(y) ~ log(X1) + log(X2) + log(X3) + log(X4) + log(X5) + log(X6) + 
##     log(X7) + log(X8) + log(X9) + log(X1):log(X2) + log(X1):log(X3) + 
##     log(X1):log(X8) + log(X1):log(X9) + log(X2):log(X8) + log(X2):log(X9) + 
##     log(X3):log(X8) + log(X3):log(X9) + log(X4):log(X8) + log(X4):log(X9) + 
##     log(X5):log(X8) + log(X5):log(X9) + log(X6):log(X8) + log(X6):log(X9) + 
##     log(X7):log(X8) + log(X7):log(X9) + log(X8):log(X9)
## 
##                   Df Sum of Sq      RSS     AIC
## - log(X3):log(X8)  1 0.0000903 0.097990 -119.72
## - log(X3):log(X9)  1 0.0003856 0.098286 -119.63
## - log(X7):log(X8)  1 0.0008771 0.098777 -119.48
## - log(X2):log(X9)  1 0.0009602 0.098860 -119.46
## - log(X2):log(X8)  1 0.0012711 0.099171 -119.36
## - log(X4):log(X9)  1 0.0013307 0.099231 -119.34
## - log(X6):log(X8)  1 0.0013914 0.099291 -119.33
## - log(X7):log(X9)  1 0.0014236 0.099324 -119.32
## - log(X5):log(X8)  1 0.0014490 0.099349 -119.31
## - log(X5):log(X9)  1 0.0015560 0.099456 -119.28
## - log(X4):log(X8)  1 0.0016138 0.099514 -119.26
## - log(X6):log(X9)  1 0.0016209 0.099521 -119.26
## - log(X1):log(X9)  1 0.0025113 0.100411 -118.99
## - log(X1):log(X2)  1 0.0025483 0.100448 -118.98
## - log(X8):log(X9)  1 0.0025636 0.100464 -118.97
## - log(X1):log(X8)  1 0.0029919 0.100892 -118.85
## <none>                         0.097900 -117.75
## - log(X1):log(X3)  1 0.0093915 0.107292 -117.00
## 
## Step:  AIC=-119.72
## log(y) ~ log(X1) + log(X2) + log(X3) + log(X4) + log(X5) + log(X6) + 
##     log(X7) + log(X8) + log(X9) + log(X1):log(X2) + log(X1):log(X3) + 
##     log(X1):log(X8) + log(X1):log(X9) + log(X2):log(X8) + log(X2):log(X9) + 
##     log(X3):log(X9) + log(X4):log(X8) + log(X4):log(X9) + log(X5):log(X8) + 
##     log(X5):log(X9) + log(X6):log(X8) + log(X6):log(X9) + log(X7):log(X8) + 
##     log(X7):log(X9) + log(X8):log(X9)
## 
##                   Df Sum of Sq     RSS     AIC
## - log(X8):log(X9)  1  0.004379 0.10237 -120.41
## - log(X1):log(X2)  1  0.004551 0.10254 -120.36
## - log(X3):log(X9)  1  0.004640 0.10263 -120.33
## <none>                         0.09799 -119.72
## - log(X7):log(X8)  1  0.006852 0.10484 -119.69
## - log(X7):log(X9)  1  0.011626 0.10962 -118.36
## - log(X6):log(X8)  1  0.012314 0.11030 -118.17
## - log(X6):log(X9)  1  0.016183 0.11417 -117.14
## - log(X1):log(X9)  1  0.022498 0.12049 -115.52
## - log(X4):log(X9)  1  0.024248 0.12224 -115.09
## - log(X1):log(X8)  1  0.025269 0.12326 -114.84
## - log(X2):log(X9)  1  0.025677 0.12367 -114.74
## - log(X1):log(X3)  1  0.027330 0.12532 -114.34
## - log(X5):log(X9)  1  0.029018 0.12701 -113.94
## - log(X4):log(X8)  1  0.030440 0.12843 -113.61
## - log(X5):log(X8)  1  0.031030 0.12902 -113.47
## - log(X2):log(X8)  1  0.034946 0.13294 -112.57
## 
## Step:  AIC=-120.41
## log(y) ~ log(X1) + log(X2) + log(X3) + log(X4) + log(X5) + log(X6) + 
##     log(X7) + log(X8) + log(X9) + log(X1):log(X2) + log(X1):log(X3) + 
##     log(X1):log(X8) + log(X1):log(X9) + log(X2):log(X8) + log(X2):log(X9) + 
##     log(X3):log(X9) + log(X4):log(X8) + log(X4):log(X9) + log(X5):log(X8) + 
##     log(X5):log(X9) + log(X6):log(X8) + log(X6):log(X9) + log(X7):log(X8) + 
##     log(X7):log(X9)
## 
##                   Df Sum of Sq     RSS     AIC
## - log(X1):log(X2)  1 0.0019105 0.10428 -121.86
## - log(X7):log(X8)  1 0.0042768 0.10665 -121.18
## <none>                         0.10237 -120.41
## - log(X7):log(X9)  1 0.0085205 0.11089 -120.01
## - log(X6):log(X8)  1 0.0088118 0.11118 -119.93
## - log(X6):log(X9)  1 0.0122126 0.11458 -119.03
## - log(X1):log(X9)  1 0.0181216 0.12049 -117.52
## - log(X4):log(X9)  1 0.0203055 0.12268 -116.98
## - log(X3):log(X9)  1 0.0205839 0.12295 -116.91
## - log(X1):log(X8)  1 0.0209330 0.12330 -116.83
## - log(X2):log(X9)  1 0.0213148 0.12368 -116.74
## - log(X1):log(X3)  1 0.0237155 0.12609 -116.16
## - log(X5):log(X9)  1 0.0249321 0.12730 -115.87
## - log(X4):log(X8)  1 0.0267336 0.12910 -115.45
## - log(X5):log(X8)  1 0.0270663 0.12944 -115.37
## - log(X2):log(X8)  1 0.0305761 0.13295 -114.57
## 
## Step:  AIC=-121.86
## log(y) ~ log(X1) + log(X2) + log(X3) + log(X4) + log(X5) + log(X6) + 
##     log(X7) + log(X8) + log(X9) + log(X1):log(X3) + log(X1):log(X8) + 
##     log(X1):log(X9) + log(X2):log(X8) + log(X2):log(X9) + log(X3):log(X9) + 
##     log(X4):log(X8) + log(X4):log(X9) + log(X5):log(X8) + log(X5):log(X9) + 
##     log(X6):log(X8) + log(X6):log(X9) + log(X7):log(X8) + log(X7):log(X9)
## 
##                   Df Sum of Sq     RSS     AIC
## - log(X7):log(X8)  1 0.0055209 0.10980 -122.31
## <none>                         0.10428 -121.86
## - log(X7):log(X9)  1 0.0100914 0.11437 -121.08
## - log(X6):log(X8)  1 0.0111603 0.11544 -120.81
## - log(X6):log(X9)  1 0.0144674 0.11875 -119.96
## - log(X2):log(X9)  1 0.0220327 0.12631 -118.11
## - log(X1):log(X9)  1 0.0224953 0.12677 -118.00
## - log(X3):log(X9)  1 0.0225020 0.12678 -118.00
## - log(X4):log(X9)  1 0.0227039 0.12698 -117.95
## - log(X1):log(X3)  1 0.0232574 0.12754 -117.82
## - log(X5):log(X9)  1 0.0244629 0.12874 -117.53
## - log(X5):log(X8)  1 0.0257774 0.13006 -117.23
## - log(X1):log(X8)  1 0.0262811 0.13056 -117.11
## - log(X4):log(X8)  1 0.0296051 0.13389 -116.36
## - log(X2):log(X8)  1 0.0312323 0.13551 -116.00
## 
## Step:  AIC=-122.31
## log(y) ~ log(X1) + log(X2) + log(X3) + log(X4) + log(X5) + log(X6) + 
##     log(X7) + log(X8) + log(X9) + log(X1):log(X3) + log(X1):log(X8) + 
##     log(X1):log(X9) + log(X2):log(X8) + log(X2):log(X9) + log(X3):log(X9) + 
##     log(X4):log(X8) + log(X4):log(X9) + log(X5):log(X8) + log(X5):log(X9) + 
##     log(X6):log(X8) + log(X6):log(X9) + log(X7):log(X9)
## 
##                   Df Sum of Sq     RSS      AIC
## <none>                         0.10980 -122.309
## - log(X3):log(X9)  1  0.020900 0.13070 -119.081
## - log(X1):log(X3)  1  0.021290 0.13109 -118.992
## - log(X7):log(X9)  1  0.023868 0.13367 -118.408
## - log(X6):log(X8)  1  0.030444 0.14024 -116.967
## - log(X5):log(X8)  1  0.032198 0.14200 -116.594
## - log(X1):log(X9)  1  0.034511 0.14431 -116.109
## - log(X5):log(X9)  1  0.036577 0.14638 -115.683
## - log(X6):log(X9)  1  0.057094 0.16689 -111.748
## - log(X1):log(X8)  1  0.058955 0.16876 -111.415
## - log(X2):log(X9)  1  0.067720 0.17752 -109.896
## - log(X4):log(X9)  1  0.086014 0.19581 -106.953
## - log(X2):log(X8)  1  0.117909 0.22771 -102.426
## - log(X4):log(X8)  1  0.199440 0.30924  -93.245
huxreg(model_wf_aic_log, model_wf_aic_all_log, model_wf_aic_log_inter, model_wf_aic_all_log_inter)
                   (1)           (2)           (3)           (4)
(Intercept)        2.692 ***     0.571         -0.981        -16.027
                   (0.445)       (3.360)       (1.520)       (13.947)
X3                 0.184 ***                   0.493 ***
                   (0.032)                     (0.066)
X4                 0.109 ***                   0.168 **
                   (0.026)                     (0.054)
X6                 -0.368 *
                   (0.146)
X7                 4.085 **                    2.515
                   (1.213)                     (1.258)
X8                 0.612 ***                   0.586 ***
                   (0.133)                     (0.087)
X9                 -0.448 ***                  -0.248 *
                   (0.108)                     (0.105)
log(X1)                          0.726 ***                   1.158
                                 (0.036)                     (0.697)
log(X3)                          0.419 ***                   1.666
                                 (0.096)                     (1.763)
log(X5)                          1.259                       5.028
                                 (0.796)                     (3.473)
log(X6)                          -0.267 **                   -0.158
                                 (0.090)                     (0.300)
log(X8)                          1.623 ***                   44.919
                                 (0.175)                     (24.175)
log(X9)                          -1.375 ***                  -37.066
                                 (0.154)                     (19.170)
X1                                             2.712 ***
                                               (0.330)
X2                                             -24.452 ***
                                               (5.631)
X5                                             0.039 *
                                               (0.016)
X1:X3                                          -0.416 ***
                                               (0.045)
X1:X9                                          -0.118 **
                                               (0.036)
X2:X9                                          4.643 **
                                               (1.449)
X3:X8                                          -0.051 **
                                               (0.014)
X4:X8                                          0.071 ***
                                               (0.015)
X7:X9                                          -0.995 *
                                               (0.344)
log(X2)                                                      -0.733
                                                             (0.338)
log(X4)                                                      -2.394
                                                             (2.963)
log(X7)                                                      -0.408
                                                             (0.518)
log(X1):log(X3)                                              0.823
                                                             (0.707)
log(X1):log(X8)                                              1.242
                                                             (0.641)
log(X1):log(X9)                                              -1.000
                                                             (0.674)
log(X2):log(X8)                                              1.089 *
                                                             (0.397)
log(X2):log(X9)                                              -0.710
                                                             (0.341)
log(X3):log(X9)                                              0.411
                                                             (0.356)
log(X4):log(X8)                                              -3.029 **
                                                             (0.849)
log(X4):log(X9)                                              2.174
                                                             (0.929)
log(X5):log(X8)                                              -7.969
                                                             (5.562)
log(X5):log(X9)                                              6.795
                                                             (4.450)
log(X6):log(X8)                                              0.403
                                                             (0.289)
log(X6):log(X9)                                              -0.506
                                                             (0.265)
log(X7):log(X9)                                              0.464
                                                             (0.376)
N                  30            30            30            30
R2                 0.947         0.986         0.994         0.998
logLik             -11.581       8.464         20.797        41.586
AIC                39.161        -0.929        -9.594        -35.172
*** p < 0.001; ** p < 0.01; * p < 0.05.
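The AIC printed in the stepAIC trace (-122.31 at the last step) and the AIC in column (4) of the table (-35.172) describe the same model on two different scales. A minimal sketch of the link, assuming the trace uses the `extractAIC` form `n*log(RSS/n) + 2*edf` and the table uses `-2*logLik + 2*(edf + 1)`; the count `edf = 23` is read off the final formula (intercept, 9 main effects, 13 interactions) and is an assumption of this check:

```r
# Values copied from the last stepAIC step and from column (4) of the table
n   <- 30        # row "N"
rss <- 0.10980   # RSS on the "<none>" line of the last step
edf <- 23        # assumed: intercept + 9 main effects + 13 interactions

# Scale of the stepAIC trace: n * log(RSS / n) + 2 * edf
aic_trace <- n * log(rss / n) + 2 * edf

# Gaussian log-likelihood of a linear model at its maximum
loglik <- -n / 2 * (log(2 * pi) + log(rss / n) + 1)

# Scale of AIC(): one extra parameter for the error variance
aic_full <- -2 * loglik + 2 * (edf + 1)

round(c(trace = aic_trace, logLik = loglik, AIC = aic_full), 3)
```

For a fixed n the two scales differ only by the constant `n*(1 + log(2*pi)) + 2`, so they rank models fitted to the same response in the same order.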
  • Mixed models
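# Mixed regression 1: raw X1, X3, X4 with two-way interactions, plus the log of the sum X2 + X5 + X6 + X7
model_wf_mix1 <- lm(log(y) ~ (X1 + X3 + X4)^2 + log(X2 + X5 + X6 + X7) + log(X8) + log(X9), data = table_wf)
# Backward elimination by AIC
model_wf_aic_mix1 <- stepAIC(model_wf_mix1)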
## Start:  AIC=-80.71
## log(y) ~ (X1 + X3 + X4)^2 + log(X2 + X5 + X6 + X7) + log(X8) + 
##     log(X9)
## 
## 
## Step:  AIC=-80.71
## log(y) ~ X1 + X3 + X4 + log(X2 + X5 + X6 + X7) + log(X8) + log(X9) + 
##     X1:X3 + X1:X4
## 
##                          Df Sum of Sq    RSS     AIC
## - log(X2 + X5 + X6 + X7)  1    0.0122 1.1294 -82.385
## <none>                                1.1172 -80.711
## - X1:X4                   1    0.2938 1.4110 -75.706
## - X1:X3                   1    1.8665 2.9837 -53.241
## - log(X9)                 1    3.2544 4.3716 -41.782
## - log(X8)                 1    3.4346 4.5518 -40.570
## 
## Step:  AIC=-82.38
## log(y) ~ X1 + X3 + X4 + log(X8) + log(X9) + X1:X3 + X1:X4
## 
##           Df Sum of Sq    RSS     AIC
## <none>                 1.1294 -82.385
## - X1:X4    1    0.3135 1.4429 -77.036
## - X1:X3    1    2.3929 3.5224 -50.262
## - log(X9)  1    3.9030 5.0324 -39.559
## - log(X8)  1    4.3068 5.4362 -37.243
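In the trace above, `log(X2 + X5 + X6 + X7)` is dropped as a single term with Df = 1, because inside a function call `+` is ordinary addition rather than a formula operator. A small sketch of the difference, using hypothetical toy data (the column names only mirror the ones in this report):

```r
# Hypothetical toy data, not the data set used in this report
toy <- data.frame(
  y  = c(2.1, 3.9, 6.2, 7.8, 10.1, 12.2),
  X2 = c(1, 2, 3, 4, 5, 6),
  X5 = c(2, 1, 4, 3, 6, 5)
)

# Inside log(), "+" adds the columns: one regressor, the log of the sum
fit_sum <- lm(log(y) ~ log(X2 + X5), data = toy)

# At the top level of the formula, "+" separates terms: two regressors
fit_sep <- lm(log(y) ~ log(X2) + log(X5), data = toy)

names(coef(fit_sum))
names(coef(fit_sep))
```

Mixed regressions 2 and 3 use the second form, which gives each predictor its own slope.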
# Mixed regression 2: all predictors logged, two-way interactions within each of the two groups
model_wf_mix2 <- lm(log(y) ~ (log(X1) + log(X3) + log(X4))^2 + (log(X2) + log(X5) + log(X6) + log(X7))^2 + log(X8) + log(X9), data = table_wf)
model_wf_aic_mix2 <- stepAIC(model_wf_mix2)
## Start:  AIC=-81.81
## log(y) ~ (log(X1) + log(X3) + log(X4))^2 + (log(X2) + log(X5) + 
##     log(X6) + log(X7))^2 + log(X8) + log(X9)
## 
## 
## Step:  AIC=-81.81
## log(y) ~ log(X1) + log(X3) + log(X4) + log(X2) + log(X5) + log(X6) + 
##     log(X7) + log(X8) + log(X9) + log(X1):log(X3) + log(X1):log(X4) + 
##     log(X3):log(X4) + log(X2):log(X5) + log(X2):log(X6) + log(X2):log(X7) + 
##     log(X5):log(X6) + log(X5):log(X7)
## 
## 
## Step:  AIC=-81.81
## log(y) ~ log(X1) + log(X3) + log(X4) + log(X2) + log(X5) + log(X6) + 
##     log(X7) + log(X8) + log(X9) + log(X1):log(X3) + log(X1):log(X4) + 
##     log(X3):log(X4) + log(X2):log(X5) + log(X2):log(X6) + log(X2):log(X7) + 
##     log(X5):log(X6)
## 
## 
## Step:  AIC=-81.81
## log(y) ~ log(X1) + log(X3) + log(X4) + log(X2) + log(X5) + log(X6) + 
##     log(X7) + log(X8) + log(X9) + log(X1):log(X3) + log(X1):log(X4) + 
##     log(X3):log(X4) + log(X2):log(X5) + log(X2):log(X6) + log(X2):log(X7)
## 
## 
## Step:  AIC=-81.81
## log(y) ~ log(X1) + log(X3) + log(X4) + log(X2) + log(X5) + log(X6) + 
##     log(X7) + log(X8) + log(X9) + log(X1):log(X3) + log(X1):log(X4) + 
##     log(X3):log(X4) + log(X2):log(X5) + log(X2):log(X6)
## 
## 
## Step:  AIC=-81.81
## log(y) ~ log(X1) + log(X3) + log(X4) + log(X2) + log(X5) + log(X6) + 
##     log(X7) + log(X8) + log(X9) + log(X1):log(X3) + log(X1):log(X4) + 
##     log(X3):log(X4) + log(X2):log(X5)
## 
## 
## Step:  AIC=-81.81
## log(y) ~ log(X1) + log(X3) + log(X4) + log(X2) + log(X5) + log(X6) + 
##     log(X7) + log(X8) + log(X9) + log(X1):log(X3) + log(X1):log(X4) + 
##     log(X3):log(X4)
## 
## 
## Step:  AIC=-81.81
## log(y) ~ log(X1) + log(X3) + log(X4) + log(X2) + log(X5) + log(X6) + 
##     log(X7) + log(X8) + log(X9) + log(X1):log(X3) + log(X1):log(X4)
## 
##                   Df Sum of Sq    RSS     AIC
## - log(X6)          1    0.0002 0.8820 -83.802
## - log(X1):log(X4)  1    0.0015 0.8833 -83.759
## - log(X5)          1    0.0082 0.8900 -83.531
## - log(X2)          1    0.0112 0.8930 -83.431
## - log(X1):log(X3)  1    0.0183 0.9001 -83.195
## - log(X7)          1    0.0187 0.9005 -83.181
## <none>                         0.8818 -81.810
## - log(X9)          1    3.4852 4.3670 -35.813
## - log(X8)          1    3.6405 4.5223 -34.765
## 
## Step:  AIC=-83.8
## log(y) ~ log(X1) + log(X3) + log(X4) + log(X2) + log(X5) + log(X7) + 
##     log(X8) + log(X9) + log(X1):log(X3) + log(X1):log(X4)
## 
##                   Df Sum of Sq    RSS     AIC
## - log(X1):log(X4)  1    0.0043 0.8863 -85.658
## - log(X5)          1    0.0305 0.9125 -84.783
## - log(X2)          1    0.0553 0.9374 -83.977
## <none>                         0.8820 -83.802
## - log(X7)          1    0.1162 0.9982 -82.091
## - log(X1):log(X3)  1    0.1300 1.0120 -81.678
## - log(X9)          1    3.4883 4.3704 -37.791
## - log(X8)          1    3.6480 4.5300 -36.714
## 
## Step:  AIC=-85.66
## log(y) ~ log(X1) + log(X3) + log(X4) + log(X2) + log(X5) + log(X7) + 
##     log(X8) + log(X9) + log(X1):log(X3)
## 
##                   Df Sum of Sq    RSS     AIC
## - log(X5)          1    0.0276 0.9138 -86.739
## - log(X2)          1    0.0541 0.9404 -85.880
## <none>                         0.8863 -85.658
## - log(X7)          1    0.1577 1.0440 -82.744
## - log(X1):log(X3)  1    0.2690 1.1553 -79.705
## - log(X4)          1    0.2899 1.1762 -79.167
## - log(X9)          1    3.5066 4.3929 -39.636
## - log(X8)          1    3.6607 4.5470 -38.602
## 
## Step:  AIC=-86.74
## log(y) ~ log(X1) + log(X3) + log(X4) + log(X2) + log(X7) + log(X8) + 
##     log(X9) + log(X1):log(X3)
## 
##                   Df Sum of Sq    RSS     AIC
## - log(X2)          1    0.0365 0.9503 -87.565
## <none>                         0.9138 -86.739
## - log(X7)          1    0.1453 1.0591 -84.312
## - log(X1):log(X3)  1    0.8295 1.7433 -69.362
## - log(X4)          1    0.9236 1.8374 -67.785
## - log(X9)          1    3.7486 4.6625 -39.850
## - log(X8)          1    4.2005 5.1143 -37.075
## 
## Step:  AIC=-87.56
## log(y) ~ log(X1) + log(X3) + log(X4) + log(X7) + log(X8) + log(X9) + 
##     log(X1):log(X3)
## 
##                   Df Sum of Sq    RSS     AIC
## <none>                         0.9503 -87.565
## - log(X7)          1    0.1793 1.1296 -84.379
## - log(X1):log(X3)  1    0.7962 1.7466 -71.307
## - log(X4)          1    0.9195 1.8698 -69.261
## - log(X9)          1    4.1464 5.0968 -39.178
## - log(X8)          1    4.5091 5.4594 -37.116
# Mixed regression 3: raw X1, X3, X4 with two-way interactions, logged X2, X5, X6, X7 with two-way interactions
model_wf_mix3 <- lm(log(y) ~ (X1 + X3 + X4)^2 + (log(X2) + log(X5) + log(X6) + log(X7))^2 + log(X8) + log(X9), data = table_wf)
model_wf_aic_mix3 <- stepAIC(model_wf_mix3)
## Start:  AIC=-81.81
## log(y) ~ (X1 + X3 + X4)^2 + (log(X2) + log(X5) + log(X6) + log(X7))^2 + 
##     log(X8) + log(X9)
## 
## 
## Step:  AIC=-81.81
## log(y) ~ X1 + X3 + X4 + log(X2) + log(X5) + log(X6) + log(X7) + 
##     log(X8) + log(X9) + X1:X3 + X1:X4 + X3:X4 + log(X2):log(X5) + 
##     log(X2):log(X6) + log(X2):log(X7) + log(X5):log(X6) + log(X5):log(X7)
## 
## 
## Step:  AIC=-81.81
## log(y) ~ X1 + X3 + X4 + log(X2) + log(X5) + log(X6) + log(X7) + 
##     log(X8) + log(X9) + X1:X3 + X1:X4 + X3:X4 + log(X2):log(X5) + 
##     log(X2):log(X6) + log(X2):log(X7) + log(X5):log(X6)
## 
## 
## Step:  AIC=-81.81
## log(y) ~ X1 + X3 + X4 + log(X2) + log(X5) + log(X6) + log(X7) + 
##     log(X8) + log(X9) + X1:X3 + X1:X4 + X3:X4 + log(X2):log(X5) + 
##     log(X2):log(X6) + log(X2):log(X7)
## 
## 
## Step:  AIC=-81.81
## log(y) ~ X1 + X3 + X4 + log(X2) + log(X5) + log(X6) + log(X7) + 
##     log(X8) + log(X9) + X1:X3 + X1:X4 + X3:X4 + log(X2):log(X5) + 
##     log(X2):log(X6)
## 
## 
## Step:  AIC=-81.81
## log(y) ~ X1 + X3 + X4 + log(X2) + log(X5) + log(X6) + log(X7) + 
##     log(X8) + log(X9) + X1:X3 + X1:X4 + X3:X4 + log(X2):log(X5)
## 
## 
## Step:  AIC=-81.81
## log(y) ~ X1 + X3 + X4 + log(X2) + log(X5) + log(X6) + log(X7) + 
##     log(X8) + log(X9) + X1:X3 + X1:X4 + X3:X4
## 
## 
## Step:  AIC=-81.81
## log(y) ~ X1 + X3 + X4 + log(X2) + log(X5) + log(X6) + log(X7) + 
##     log(X8) + log(X9) + X1:X3 + X1:X4
## 
##           Df Sum of Sq    RSS     AIC
## - log(X6)  1    0.0002 0.8820 -83.802
## - log(X5)  1    0.0082 0.8900 -83.531
## - log(X2)  1    0.0112 0.8930 -83.431
## - log(X7)  1    0.0187 0.9005 -83.181
## - X1:X4    1    0.0379 0.9197 -82.547
## <none>                 0.8818 -81.810
## - X1:X3    1    0.3016 1.1834 -74.984
## - log(X9)  1    3.4852 4.3670 -35.813
## - log(X8)  1    3.6405 4.5223 -34.765
## 
## Step:  AIC=-83.8
## log(y) ~ X1 + X3 + X4 + log(X2) + log(X5) + log(X7) + log(X8) + 
##     log(X9) + X1:X3 + X1:X4
## 
##           Df Sum of Sq    RSS     AIC
## - log(X5)  1    0.0305 0.9125 -84.783
## - log(X2)  1    0.0553 0.9374 -83.977
## <none>                 0.8820 -83.802
## - log(X7)  1    0.1162 0.9982 -82.091
## - X1:X4    1    0.1973 1.0793 -79.746
## - X1:X3    1    1.9085 2.7905 -51.249
## - log(X9)  1    3.4883 4.3704 -37.791
## - log(X8)  1    3.6480 4.5300 -36.714
## 
## Step:  AIC=-84.78
## log(y) ~ X1 + X3 + X4 + log(X2) + log(X7) + log(X8) + log(X9) + 
##     X1:X3 + X1:X4
## 
##           Df Sum of Sq    RSS     AIC
## - log(X2)  1    0.0277 0.9402 -85.886
## <none>                 0.9125 -84.783
## - log(X7)  1    0.1121 1.0246 -83.308
## - X1:X4    1    0.1731 1.0856 -81.571
## - X1:X3    1    1.8806 2.7930 -53.222
## - log(X9)  1    3.6887 4.6012 -38.246
## - log(X8)  1    4.1243 5.0368 -35.533
## 
## Step:  AIC=-85.89
## log(y) ~ X1 + X3 + X4 + log(X7) + log(X8) + log(X9) + X1:X3 + 
##     X1:X4
## 
##           Df Sum of Sq    RSS     AIC
## <none>                 0.9402 -85.886
## - log(X7)  1    0.1892 1.1294 -82.385
## - X1:X4    1    0.2211 1.1613 -81.549
## - X1:X3    1    2.5494 3.4896 -48.542
## - log(X9)  1    4.0912 5.0314 -37.565
## - log(X8)  1    4.4859 5.4261 -35.300
huxreg(model_wf_aic_mix1, model_wf_aic_mix2, model_wf_aic_mix3)
                   (1)           (2)           (3)
(Intercept)        2.598 ***     2.908 ***     2.044 ***
                   (0.228)       (0.702)       (0.343)
X1                 1.290 ***                   1.459 ***
                   (0.232)                     (0.232)
X3                 0.296 ***                   0.275 ***
                   (0.040)                     (0.039)
X4                 0.391 ***                   0.423 ***
                   (0.042)                     (0.042)
log(X8)            1.575 ***     1.631 ***     1.628 ***
                   (0.172)       (0.160)       (0.163)
log(X9)            -1.345 ***    -1.411 ***    -1.405 ***
                   (0.154)       (0.144)       (0.147)
X1:X3              -0.381 ***                  -0.398 ***
                   (0.056)                     (0.053)
X1:X4              0.031 *                     0.026 *
                   (0.012)                     (0.012)
log(X1)                          0.143
                                 (0.184)
log(X3)                          -1.810 **
                                 (0.490)
log(X4)                          3.546 ***
                                 (0.769)
log(X7)                          -0.408        -0.428
                                 (0.200)       (0.208)
log(X1):log(X3)                  -0.756 ***
                                 (0.176)
N                  30            30            30
R2                 0.984         0.987         0.987
logLik             6.624         9.214         9.375
AIC                4.752         -0.428        1.250
*** p < 0.001; ** p < 0.01; * p < 0.05.
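The three mixed models have nearly the same R2, so the AIC row is what separates them. A short sketch that rebuilds that row from the logLik row and ranks the models; the coefficient counts are read off the table columns (intercept included), and the labels `mix1` to `mix3` are shorthand for columns (1) to (3):

```r
# logLik values copied from the table; n_coef counted from each column
mix <- data.frame(
  model  = c("mix1", "mix2", "mix3"),
  logLik = c(6.624, 9.214, 9.375),
  n_coef = c(8, 8, 9)   # intercept plus the slopes shown in each column
)

# AIC = -2 * logLik + 2 * (coefficients + 1 for the error variance)
mix$AIC <- -2 * mix$logLik + 2 * (mix$n_coef + 1)

# Difference from the best (smallest) AIC
mix$delta <- mix$AIC - min(mix$AIC)

mix[order(mix$AIC), ]
```

On this criterion column (2) comes first; column (3) has a slightly higher log-likelihood but pays for one more coefficient.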

Additional Steps

  • Stepwise selection in both directions
# Stepwise regression based on p-values for the full model
k <- ols_step_both_p(model_wf_full_log)
## Stepwise Selection Method   
## ---------------------------
## 
## Candidate Terms: 
## 
## 1. X1 
## 2. X2 
## 3. X3 
## 4. X4 
## 5. X5 
## 6. X6 
## 7. X7 
## 8. X8 
## 9. X9 
## 
## We are selecting variables based on p value...
## 
## Variables Entered/Removed: 
## 
## - X4 added 
## - X3 added 
## - X7 added 
## 
## No more variables to be added/removed.
## 
## 
## Final Model Output 
## ------------------
## 
##                         Model Summary                         
## -------------------------------------------------------------
## R                       0.944       RMSE               0.549 
## R-Squared               0.890       Coef. Var          8.618 
## Adj. R-Squared          0.878       MSE                0.301 
## Pred R-Squared          0.854       MAE                0.414 
## -------------------------------------------------------------
##  RMSE: Root Mean Square Error 
##  MSE: Mean Square Error 
##  MAE: Mean Absolute Error 
## 
##                                ANOVA                                
## -------------------------------------------------------------------
##                Sum of                                              
##               Squares        DF    Mean Square      F         Sig. 
## -------------------------------------------------------------------
## Regression     63.565         3         21.188    70.378    0.0000 
## Residual        7.828        26          0.301                     
## Total          71.393        29                                    
## -------------------------------------------------------------------
## 
##                                  Parameter Estimates                                  
## -------------------------------------------------------------------------------------
##       model     Beta    Std. Error    Std. Beta      t       Sig      lower    upper 
## -------------------------------------------------------------------------------------
## (Intercept)    2.872         0.547                 5.254    0.000     1.748    3.995 
##          X4    0.122         0.033        0.559    3.730    0.001     0.055    0.189 
##          X3    0.168         0.040        0.435    4.165    0.000     0.085    0.251 
##          X7    3.106         1.537        0.309    2.021    0.054    -0.053    6.266 
## -------------------------------------------------------------------------------------
k
## 
##                              Stepwise Selection Summary                              
## ------------------------------------------------------------------------------------
##                      Added/                   Adj.                                      
## Step    Variable    Removed     R-Square    R-Square     C(p)        AIC       RMSE     
## ------------------------------------------------------------------------------------
##    1       X4       addition       0.803       0.796    48.8550    68.4060    0.7087    
##    2       X3       addition       0.873       0.864    24.2130    57.2082    0.5792    
##    3       X7       addition       0.890       0.878    19.6670    54.8305    0.5487    
## ------------------------------------------------------------------------------------
# plot(k)
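The ANOVA and parameter tables of the p-value based selection can be checked by hand. A minimal sketch using only the printed (rounded) values, so small differences in the last digit are expected:

```r
# Sums of squares and degrees of freedom copied from the ANOVA table
ss_reg <- 63.565; df_reg <- 3
ss_res <- 7.828;  df_res <- 26

# Mean squares and the overall F statistic
ms_reg <- ss_reg / df_reg
ms_res <- ss_res / df_res
f_stat <- ms_reg / ms_res
p_f    <- pf(f_stat, df_reg, df_res, lower.tail = FALSE)

# t statistic and two-sided p-value for X7 (estimate / standard error)
t_x7 <- 3.106 / 1.537
p_x7 <- 2 * pt(abs(t_x7), df_res, lower.tail = FALSE)

round(c(F = f_stat, t_X7 = t_x7, p_X7 = p_x7), 3)
```

The p-value for X7 sits just above 0.05, which matches the confidence interval for X7 in the table crossing zero.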

# Stepwise regression based on AIC for the full model
k <- ols_step_both_aic(model_wf_full_log)
## Stepwise Selection Method 
## -------------------------
## 
## Candidate Terms: 
## 
## 1 . X1 
## 2 . X2 
## 3 . X3 
## 4 . X4 
## 5 . X5 
## 6 . X6 
## 7 . X7 
## 8 . X8 
## 9 . X9 
## 
## 
## Variables Entered/Removed: 
## 
## - X4 added 
## - X3 added 
## - X7 added 
## - X8 added 
## - X9 added 
## - X6 added 
## 
## No more variables to be added or removed.
k
## 
## 
##                               Stepwise Summary                              
## --------------------------------------------------------------------------
## Variable     Method      AIC       RSS      Sum Sq     R-Sq      Adj. R-Sq 
## --------------------------------------------------------------------------
## X4          addition    68.406    14.063    57.330    0.80302      0.79599 
## X3          addition    57.208     9.057    62.335    0.87313      0.86373 
## X7          addition    54.830     7.828    63.565    0.89036      0.87771 
## X8          addition    54.522     7.248    64.144    0.89848      0.88223 
## X9          addition    44.504     4.856    66.537    0.93199      0.91782 
## X6          addition    39.161     3.801    67.591    0.94675      0.93286 
## --------------------------------------------------------------------------
# plot(k)
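Every column of the stepwise summary follows from the RSS column. A sketch that recomputes R-Sq, Adj. R-Sq, and AIC, assuming n = 30, the total sum of squares 71.393 from the ANOVA table, and the log-likelihood form of AIC with the error variance counted as a parameter:

```r
n      <- 30
ss_tot <- 71.393   # "Total" row of the ANOVA table

# RSS after each addition, copied from the stepwise summary
steps <- data.frame(
  term = c("X4", "X3", "X7", "X8", "X9", "X6"),
  rss  = c(14.063, 9.057, 7.828, 7.248, 4.856, 3.801)
)
steps$p <- seq_len(nrow(steps))   # number of slopes in the model at each step

# R-squared and adjusted R-squared
steps$r2     <- 1 - steps$rss / ss_tot
steps$adj_r2 <- 1 - (1 - steps$r2) * (n - 1) / (n - steps$p - 1)

# AIC = -2 * logLik + 2 * (slopes + intercept + error variance)
steps$aic <- n * (log(2 * pi) + log(steps$rss / n) + 1) + 2 * (steps$p + 2)

print(steps, digits = 5)
```

The AIC falls at every step, which is why all six additions are accepted, although the gain from adding X8 (54.830 to 54.522) is marginal.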

# Stepwise regression based on p-values for the all-log model
k <- ols_step_both_p(model_wf_all_log)
## Stepwise Selection Method   
## ---------------------------
## 
## Candidate Terms: 
## 
## 1. log(X1) 
## 2. log(X2) 
## 3. log(X3) 
## 4. log(X4) 
## 5. log(X5) 
## 6. log(X6) 
## 7. log(X7) 
## 8. log(X8) 
## 9. log(X9) 
## 
## We are selecting variables based on p value...
## 
## Variables Entered/Removed: 
## 
## - log(X4) added 
## 
## No more variables to be added/removed.
## 
## 
## Final Model Output 
## ------------------
## 
##                         Model Summary                         
## -------------------------------------------------------------
## R                       0.954       RMSE               0.479 
## R-Squared               0.910       Coef. Var          7.526 
## Adj. R-Squared          0.907       MSE                0.230 
## Pred R-Squared          0.896       MAE                0.353 
## -------------------------------------------------------------
##  RMSE: Root Mean Square Error 
##  MSE: Mean Square Error 
##  MAE: Mean Absolute Error 
## 
##                                ANOVA                                 
## --------------------------------------------------------------------
##                Sum of                                               
##               Squares        DF    Mean Square       F         Sig. 
## --------------------------------------------------------------------
## Regression     64.964         1         64.964    282.937    0.0000 
## Residual        6.429        28          0.230                      
## Total          71.393        29                                     
## --------------------------------------------------------------------
## 
##                                  Parameter Estimates                                  
## -------------------------------------------------------------------------------------
##       model     Beta    Std. Error    Std. Beta      t        Sig     lower    upper 
## -------------------------------------------------------------------------------------
## (Intercept)    4.189         0.156                 26.803    0.000    3.868    4.509 
##     log(X4)    1.259         0.075        0.954    16.821    0.000    1.106    1.413 
## -------------------------------------------------------------------------------------
k
## 
##                              Stepwise Selection Summary                               
## -------------------------------------------------------------------------------------
##                      Added/                   Adj.                                       
## Step    Variable    Removed     R-Square    R-Square      C(p)        AIC       RMSE     
## -------------------------------------------------------------------------------------
##    1    log(X4)     addition       0.910       0.907    113.3920    44.9246    0.4792    
## -------------------------------------------------------------------------------------
#plot(k)
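As a consistency check on the ANOVA table above, the F statistic can be recomputed from the reported sums of squares. With a single predictor it should also equal the square of the slope's t statistic. The sketch below uses only values printed in the output; the variable names are illustrative.

```r
# Values copied from the ANOVA table for the one-term model with log(X4)
ss_reg <- 64.964; df_reg <- 1
ss_res <- 6.429;  df_res <- 28

ms_reg <- ss_reg / df_reg
ms_res <- ss_res / df_res            # MSE, reported as 0.230
f_stat <- ms_reg / ms_res            # reported as 282.937
p_val  <- pf(f_stat, df_reg, df_res, lower.tail = FALSE)

# With one predictor, F equals the squared t statistic of the slope
t_slope <- 16.821
c(F = f_stat, t_squared = t_slope^2, p = p_val)
```

The two quantities agree up to the rounding of the printed values.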

# Stepwise AIC Regression for all log model #
k <- ols_step_both_aic(model_wf_all_log)
## Stepwise Selection Method 
## -------------------------
## 
## Candidate Terms: 
## 
## 1 . log(X1) 
## 2 . log(X2) 
## 3 . log(X3) 
## 4 . log(X4) 
## 5 . log(X5) 
## 6 . log(X6) 
## 7 . log(X7) 
## 8 . log(X8) 
## 9 . log(X9) 
## 
## 
## Variables Entered/Removed: 
## 
## - log(X4) added 
## - log(X8) added 
## - log(X9) added 
## - log(X6) added 
## - log(X1) added 
## - log(X3) added 
## 
## No more variables to be added or removed.
k
## 
## 
##                              Stepwise Summary                              
## -------------------------------------------------------------------------
## Variable     Method      AIC       RSS     Sum Sq     R-Sq      Adj. R-Sq 
## -------------------------------------------------------------------------
## log(X4)     addition    44.925    6.429    64.964    0.90995      0.90673 
## log(X8)     addition    44.798    5.989    65.404    0.91611      0.90990 
## log(X9)     addition    14.099    2.014    69.379    0.97179      0.96854 
## log(X6)     addition     4.579    1.372    70.021    0.98079      0.97771 
## log(X1)     addition     2.166    1.184    70.209    0.98342      0.97996 
## log(X3)     addition     0.009    1.031    70.362    0.98556      0.98180 
## -------------------------------------------------------------------------
# plot(k)
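The AIC column in the stepwise summary appears to follow the usual Gaussian log-likelihood convention (the one used by `stats::AIC` for `lm` fits), so it can be reproduced from the RSS column alone. The sketch below assumes n = 30 (total DF 29 plus one) and counts the intercept and the error variance as two extra parameters; `aic_from_rss` is a helper name introduced here for illustration.

```r
# AIC of a Gaussian linear model from its residual sum of squares.
# k = number of predictors; "+ 2" counts the intercept and the error variance.
aic_from_rss <- function(rss, n, k) {
  n * (log(2 * pi) + log(rss / n) + 1) + 2 * (k + 2)
}

n   <- 30
rss <- c(6.429, 5.989, 2.014, 1.372, 1.184, 1.031)   # RSS column, steps 1 to 6
aic <- aic_from_rss(rss, n, k = seq_along(rss))
round(aic, 3)
```

The results match the reported AIC values to within the rounding of RSS, which supports reading each step as adding one parameter to the previous model.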

# Stepwise Regression based on p values for full log interaction model #
# k <- ols_step_both_p(model_wf_full_log_inter)
# k
# plot(k)

# Stepwise AIC Regression for full log interaction model #
# k <- ols_step_both_aic(model_wf_full_log_inter)
# k
# plot(k)

# Stepwise Regression based on p values for all log interaction model #
# k <- ols_step_both_p(model_wf_all_log_inter)
# k
# plot(k)

# Stepwise AIC Regression for all log interaction model #
# k <- ols_step_both_aic(model_wf_all_log_inter)
# k
# plot(k)

# Stepwise Regression based on p values for mixed model (mix2) #
# k <- ols_step_both_p(model_wf_mix2)
# k
# plot(k)

# Stepwise AIC Regression for mixed model (mix2) #
# k <- ols_step_both_aic(model_wf_mix2)
# k
# plot(k)

# Stepwise Regression based on p values for X4 eliminated model#
# k <- ols_step_both_p(model_wf_rm4_log)
# k
# plot(k)

# Stepwise AIC Regression for X4 eliminated model#
# k <- ols_step_both_aic(model_wf_rm4_log)
# k
# plot(k)

# Stepwise Regression based on p values for X1 eliminated model#
# k <- ols_step_both_p(model_wf_rm1_log)
# k
# plot(k)

# Stepwise AIC Regression for X1 eliminated model#
# k <- ols_step_both_aic(model_wf_rm1_log)
# k
# plot(k)
# All Possible Regression for full log model #
# k <- ols_step_all_possible(model_wf_full_log)
# plot(k)
# head(arrange(k, desc(adjr)))

# All Possible Regression for all log model #
# k <- ols_step_all_possible(model_wf_all_log)
# plot(k)
# head(arrange(k, desc(adjr)))

# All Possible Regression for 3g log interaction model #
# k <- ols_step_all_possible(model_wf_3g_log_inter)
# plot(k)
# head(arrange(k, desc(adjr)))
 
# All Possible Regression for mixed log model #
# k <- ols_step_all_possible(model_wf_mix2 )
# plot(k)
#  head(arrange(k, desc(adjr)))

# All Possible Regression for X4 eliminated model #
# k <- ols_step_all_possible(model_wf_rm4_log)
# k
# plot(k)

# All Possible Regression for X1 eliminated model #
# k <- ols_step_all_possible(model_wf_rm1_log)
# k
# plot(k)
# Lack of Fit F Test

ols_pure_error_anova(lm(y~X1, data = table_wf))
ols_pure_error_anova(lm(y~X4, data = table_wf))
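A lack-of-fit F test splits the residual sum of squares of a simple regression into pure error (variation among replicates at the same x) and lack of fit (systematic departure from the fitted line), so it needs repeated x values. The sketch below shows the same idea in base R by comparing the straight-line fit with a model that gives every x level its own mean. The data are simulated and hypothetical, not `table_wf`.

```r
set.seed(1)
# Hypothetical data: 5 distinct x levels, 4 replicates each, truly quadratic mean
x <- rep(1:5, each = 4)
y <- 2 + 0.5 * x^2 + rnorm(length(x), sd = 0.5)

fit_line  <- lm(y ~ x)           # straight-line model under test
fit_means <- lm(y ~ factor(x))   # one mean per x level: its residual SS is the pure error

# The extra sum of squares between the two fits is the lack-of-fit SS
lof <- anova(fit_line, fit_means)
lof
```

A small p-value in the second row indicates that the straight line misses structure the level means capture, which is the case here by construction.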

alias(lm(y ~ as.factor(X3) + as.factor(X4) + as.factor(X5) + as.factor(X6) + as.factor(X7), data=table_wf))

alias(lm(y ~ as.factor(X1) + as.factor(X8) , data=table_wf))

alias(lm(y ~ as.factor(X4) + as.factor(X9) , data=table_wf))

alias(lm(y ~ as.factor(X3) + as.factor(X6) + as.factor(X7) + as.factor(X8) + as.factor(X9) , data=table_wf))
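The `alias()` calls above look for exact linear dependencies among model terms. As a minimal illustration of what it reports, the sketch below builds a hypothetical data frame in which one column is the sum of two others; none of these names come from `table_wf`.

```r
# Hypothetical data in which x3 is an exact linear combination of x1 and x2
set.seed(2)
d <- data.frame(x1 = rnorm(20), x2 = rnorm(20))
d$x3 <- d$x1 + d$x2
d$y  <- 1 + d$x1 - d$x2 + rnorm(20)

fit <- lm(y ~ x1 + x2 + x3, data = d)
coef(fit)            # x3 is NA: it is not estimable given x1 and x2
al <- alias(fit)
al$Complete          # the x3 row expresses it in terms of the other columns
```

A non-empty `Complete` component is the same condition that `summary()` reports as coefficients "not defined because of singularities".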

Final models

  • Check model aic_all_log
ols_regress(model_wf_aic_all_log )
##                         Model Summary                         
## -------------------------------------------------------------
## R                       0.993       RMSE               0.208 
## R-Squared               0.986       Coef. Var          3.273 
## Adj. R-Squared          0.982       MSE                0.043 
## Pred R-Squared          0.975       MAE                0.136 
## -------------------------------------------------------------
##  RMSE: Root Mean Square Error 
##  MSE: Mean Square Error 
##  MAE: Mean Absolute Error 
## 
##                                ANOVA                                 
## --------------------------------------------------------------------
##                Sum of                                               
##               Squares        DF    Mean Square       F         Sig. 
## --------------------------------------------------------------------
## Regression     70.394         6         11.732    270.106    0.0000 
## Residual        0.999        23          0.043                      
## Total          71.393        29                                     
## --------------------------------------------------------------------
## 
##                                   Parameter Estimates                                    
## ----------------------------------------------------------------------------------------
##       model      Beta    Std. Error    Std. Beta      t        Sig      lower     upper 
## ----------------------------------------------------------------------------------------
## (Intercept)     0.571         3.360                  0.170    0.866    -6.379     7.522 
##     log(X1)     0.726         0.036        0.967    20.107    0.000     0.651     0.800 
##     log(X3)     0.419         0.096        0.139     4.359    0.000     0.220     0.617 
##     log(X5)     1.259         0.796        0.083     1.582    0.127    -0.387     2.905 
##     log(X6)    -0.267         0.090       -0.087    -2.960    0.007    -0.454    -0.080 
##     log(X8)     1.623         0.175        0.510     9.267    0.000     1.260     1.985 
##     log(X9)    -1.375         0.154       -0.503    -8.919    0.000    -1.694    -1.056 
## ----------------------------------------------------------------------------------------
summary(model_wf_aic_all_log)
## 
## Call:
## lm(formula = log(y) ~ log(X1) + log(X3) + log(X5) + log(X6) + 
##     log(X8) + log(X9), data = table_wf)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.67722 -0.08003  0.01102  0.13879  0.25715 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  0.57120    3.36000   0.170  0.86650    
## log(X1)      0.72550    0.03608  20.107 4.31e-16 ***
## log(X3)      0.41866    0.09605   4.359  0.00023 ***
## log(X5)      1.25873    0.79566   1.582  0.12731    
## log(X6)     -0.26702    0.09022  -2.960  0.00702 ** 
## log(X8)      1.62253    0.17508   9.267 3.15e-09 ***
## log(X9)     -1.37489    0.15416  -8.919 6.33e-09 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.2084 on 23 degrees of freedom
## Multiple R-squared:  0.986,  Adjusted R-squared:  0.9824 
## F-statistic: 270.1 on 6 and 23 DF,  p-value: < 2.2e-16
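The confidence limits in the `ols_regress` table follow from the estimate, its standard error, and the t quantile on the residual degrees of freedom. The sketch below reproduces the interval for `log(X1)` from the printed values; the variable names are illustrative.

```r
# Estimate and standard error of log(X1), copied from the coefficient table
est    <- 0.72550
se     <- 0.03608
df_res <- 23                       # residual degrees of freedom

t_crit <- qt(0.975, df_res)        # two-sided 95% critical value
ci     <- est + c(-1, 1) * t_crit * se
round(ci, 3)                       # reported limits: 0.651 and 0.800
t_stat <- est / se                 # reported t value: 20.107
```

The same arithmetic applied to `log(X5)` gives an interval that contains zero, consistent with its p-value of 0.127.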
Anova(model_wf_aic_all_log)
Term         Sum Sq    Df    F value    Pr(>F)
log(X1)      17.6       1    404        4.31e-16
log(X3)      0.825      1    19         0.00023
log(X5)      0.109      1    2.5        0.127
log(X6)      0.38       1    8.76       0.00702
log(X8)      3.73       1    85.9       3.15e-09
log(X9)      3.45       1    79.5       6.33e-09
Residuals    0.999     23
# Collinearity Diagnostics #
ols_vif_tol(model_wf_aic_all_log)
Variables    Tolerance    VIF
log(X1)      0.263        3.8
log(X3)      0.603        1.66
log(X5)      0.22         4.55
log(X6)      0.71         1.41
log(X8)      0.201        4.99
log(X9)      0.191        5.22
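Tolerance and VIF are reciprocals: the tolerance of a predictor is 1 − R² from regressing it on the other predictors, and VIF = 1 / tolerance (for `log(X1)`, 1 / 0.263 ≈ 3.80). The sketch below computes both by hand on the built-in `mtcars` data, used here only as a stand-in for `table_wf`; `vif_one` is a helper name introduced for illustration.

```r
# Stand-in data: the built-in mtcars set, not the table_wf data used above
X <- mtcars[, c("disp", "hp", "wt", "drat")]

vif_one <- function(j) {
  # Regress predictor j on the remaining predictors
  r2 <- summary(lm(X[[j]] ~ ., data = X[-j]))$r.squared
  1 / (1 - r2)                      # tolerance is 1 - r2
}
vif <- sapply(seq_along(X), vif_one)
names(vif) <- names(X)
data.frame(Tolerance = 1 / vif, VIF = vif)
```

The same values appear on the diagonal of the inverse correlation matrix of the predictors, which gives a quick cross-check.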
#Model Fit Assessment
ols_plot_diagnostics(model_wf_aic_all_log)

# Correlation Test for Normality
ols_test_correlation(model_wf_aic_all_log) # Correlation between observed residuals and expected residuals under normality.
## [1] 0.9278999
# Residual Normality Test
ols_test_normality(model_wf_aic_all_log) # Tests for violation of the normality assumption. A p-value above the chosen significance level (e.g. 0.05) gives no evidence against normality.
## -----------------------------------------------
##        Test             Statistic       pvalue  
## -----------------------------------------------
## Shapiro-Wilk              0.8746         0.0021 
## Kolmogorov-Smirnov        0.0964         0.9180 
## Cramer-von Mises          7.0221         0.0000 
## Anderson-Darling          0.7277         0.0516 
## -----------------------------------------------
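The four tests above do not agree for this model (Shapiro-Wilk rejects at 0.05 while Kolmogorov-Smirnov does not), so it can help to see what the simpler checks compute. The sketch below runs Shapiro-Wilk and a probability-plot correlation in base R on a stand-in `mtcars` model, not on `table_wf`. The plotting positions from `ppoints()` are an assumption and may differ slightly from what `olsrr` uses internally.

```r
# Stand-in model on built-in data (not the table_wf model above)
fit <- lm(mpg ~ wt + hp, data = mtcars)
res <- residuals(fit)

sw <- shapiro.test(res)                       # Shapiro-Wilk on the residuals

# Correlation between ordered residuals and normal quantiles; values close
# to 1 are consistent with normally distributed residuals.
qq_cor <- cor(sort(res), qnorm(ppoints(length(res))))

c(shapiro_p = sw$p.value, qq_correlation = qq_cor)
```

With only 30 observations these tests have limited power, so the diagnostic plots remain the more informative check.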
# Variable Contributions
ols_plot_added_variable(model_wf_aic_all_log)

# Residual Plus Component Plot
ols_plot_comp_plus_resid(model_wf_aic_all_log)

  • Check model aic_mix1
summary(model_wf_mix1)
## 
## Call:
## lm(formula = log(y) ~ (X1 + X3 + X4)^2 + log(X2 + X5 + X6 + X7) + 
##     log(X8) + log(X9), data = table_wf)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.59020 -0.09016 -0.00499  0.07966  0.35476 
## 
## Coefficients: (1 not defined because of singularities)
##                        Estimate Std. Error t value Pr(>|t|)    
## (Intercept)            -0.17550    5.79376  -0.030   0.9761    
## X1                      1.34608    0.26376   5.103 4.70e-05 ***
## X3                      0.31833    0.06162   5.166 4.06e-05 ***
## X4                      0.39212    0.04259   9.207 8.07e-09 ***
## log(X2 + X5 + X6 + X7)  0.62509    1.30490   0.479   0.6369    
## log(X8)                 1.53840    0.19147   8.035 7.68e-08 ***
## log(X9)                -1.31591    0.16825  -7.821 1.18e-07 ***
## X1:X3                  -0.39809    0.06721  -5.923 7.04e-06 ***
## X1:X4                   0.03412    0.01452   2.350   0.0286 *  
## X3:X4                        NA         NA      NA       NA    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.2307 on 21 degrees of freedom
## Multiple R-squared:  0.9844, Adjusted R-squared:  0.9784 
## F-statistic: 165.1 on 8 and 21 DF,  p-value: < 2.2e-16
Anova(model_wf_mix1)
Term                      Sum Sq     Df    F value    Pr(>F)
X1                        0.434       1    8.16       0.00943
X3                        0.00231     1    0.0434     0.837
X4                        5.91        1    111        7.63e-10
log(X2 + X5 + X6 + X7)    0.0122      1    0.229      0.637
log(X8)                   3.43        1    64.6       7.68e-08
log(X9)                   3.25        1    61.2       1.18e-07
X1:X3                                 0
X1:X4                                 0
X3:X4                                 0
Residuals                 1.12       21
# Collinearity Diagnostics #
ols_vif_tol(model_wf_mix1)
Variables                 Tolerance    VIF
X1                        0            Inf
X3                        0            Inf
X4                        0            Inf
log(X2 + X5 + X6 + X7)    0.112        8.91
log(X8)                   0.205        4.87
log(X9)                   0.197        5.08
X1:X3                     0            Inf
X1:X4                     0            Inf
X3:X4                     0            Inf
#Model Fit Assessment
ols_plot_diagnostics(model_wf_mix1)

# Correlation Test for Normality
ols_test_correlation(model_wf_mix1) # Correlation between observed residuals and expected residuals under normality.
## [1] 0.9658267
# Residual Normality Test
ols_test_normality(model_wf_mix1) # Tests for violation of the normality assumption. A p-value above the chosen significance level (e.g. 0.05) gives no evidence against normality.
## -----------------------------------------------
##        Test             Statistic       pvalue  
## -----------------------------------------------
## Shapiro-Wilk              0.9419         0.1026 
## Kolmogorov-Smirnov        0.1319         0.6261 
## Cramer-von Mises          6.9129         0.0000 
## Anderson-Darling          0.6116         0.1016 
## -----------------------------------------------
# Variable Contributions
ols_plot_added_variable(model_wf_mix1)

# Residual Plus Component Plot
ols_plot_comp_plus_resid(model_wf_mix1)

  • Check model 3g
# summary(model_wf_3g_aic_log_inter)
# Anova(model_wf_3g_aic_log_inter)

# Collinearity Diagnostics #
# ols_vif_tol(model_wf_3g_aic_log_inter)

#Model Fit Assessment
# ols_plot_diagnostics(model_wf_3g_aic_log_inter)

# Correlation Test for Normality
# ols_test_correlation(model_wf_3g_aic_log_inter) # Correlation between observed residuals and expected residuals under normality.

# Residual Normality Test
# ols_test_normality(model_wf_3g_aic_log_inter) # Tests for violation of the normality assumption. A p-value above the chosen significance level (e.g. 0.05) gives no evidence against normality.

# Variable Contributions
# ols_plot_added_variable(model_wf_3g_aic_log_inter)

# Residual Plus Component Plot
# ols_plot_comp_plus_resid(model_wf_3g_aic_log_inter)
  • Check model all log interaction
summary(model_wf_aic_all_log_inter)
## 
## Call:
## lm(formula = log(y) ~ log(X1) + log(X2) + log(X3) + log(X4) + 
##     log(X5) + log(X6) + log(X7) + log(X8) + log(X9) + log(X1):log(X3) + 
##     log(X1):log(X8) + log(X1):log(X9) + log(X2):log(X8) + log(X2):log(X9) + 
##     log(X3):log(X9) + log(X4):log(X8) + log(X4):log(X9) + log(X5):log(X8) + 
##     log(X5):log(X9) + log(X6):log(X8) + log(X6):log(X9) + log(X7):log(X9), 
##     data = table_wf)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.114032 -0.038669 -0.003953  0.026220  0.160039 
## 
## Coefficients:
##                 Estimate Std. Error t value Pr(>|t|)   
## (Intercept)     -16.0272    13.9469  -1.149  0.28823   
## log(X1)           1.1577     0.6966   1.662  0.14045   
## log(X2)          -0.7328     0.3383  -2.166  0.06698 . 
## log(X3)           1.6656     1.7630   0.945  0.37625   
## log(X4)          -2.3936     2.9633  -0.808  0.44581   
## log(X5)           5.0279     3.4728   1.448  0.19094   
## log(X6)          -0.1583     0.3004  -0.527  0.61463   
## log(X7)          -0.4075     0.5183  -0.786  0.45748   
## log(X8)          44.9189    24.1746   1.858  0.10550   
## log(X9)         -37.0656    19.1702  -1.934  0.09443 . 
## log(X1):log(X3)   0.8234     0.7067   1.165  0.28218   
## log(X1):log(X8)   1.2421     0.6407   1.939  0.09372 . 
## log(X1):log(X9)  -0.9995     0.6739  -1.483  0.18156   
## log(X2):log(X8)   1.0890     0.3972   2.742  0.02885 * 
## log(X2):log(X9)  -0.7095     0.3415  -2.078  0.07633 . 
## log(X3):log(X9)   0.4112     0.3562   1.154  0.28626   
## log(X4):log(X8)  -3.0288     0.8494  -3.566  0.00915 **
## log(X4):log(X9)   2.1744     0.9286   2.342  0.05172 . 
## log(X5):log(X8)  -7.9687     5.5620  -1.433  0.19504   
## log(X5):log(X9)   6.7951     4.4498   1.527  0.17059   
## log(X6):log(X8)   0.4027     0.2891   1.393  0.20622   
## log(X6):log(X9)  -0.5057     0.2651  -1.908  0.09807 . 
## log(X7):log(X9)   0.4637     0.3759   1.234  0.25719   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.1252 on 7 degrees of freedom
## Multiple R-squared:  0.9985, Adjusted R-squared:  0.9936 
## F-statistic: 206.6 on 22 and 7 DF,  p-value: 7.713e-08
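The interaction model uses 22 of the 29 available degrees of freedom, leaving only 7 for the error variance, so its very high R² should be read together with the adjusted value. The sketch below recomputes adjusted R² for the three fitted models from the printed R² and residual degrees of freedom, assuming n = 30; the data frame is assembled by hand from the summaries above.

```r
n <- 30
models <- data.frame(
  model  = c("aic_all_log", "mix1", "aic_all_log_inter"),
  r2     = c(0.9860, 0.9844, 0.9985),   # Multiple R-squared as printed
  df_res = c(23, 21, 7)                 # residual degrees of freedom
)
# Adjusted R-squared: 1 - (1 - R2) * (n - 1) / residual df
models$adj_r2 <- 1 - (1 - models$r2) * (n - 1) / models$df_res
models
```

The penalty factor (n − 1) / df grows from about 1.26 for the six-term model to about 4.14 for the interaction model, which is why a small gain in R² is needed to justify each added term.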
Anova(model_wf_aic_all_log_inter)
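Term               Sum Sq      Df    F value    Pr(>F)
log(X1)            0.281        1    17.9       0.00386
log(X2)            0.00972      1    0.619      0.457
log(X3)            0.0102       1    0.648      0.447
log(X4)            0.000172     1    0.0109     0.92
log(X5)            0.0311       1    1.98       0.202
log(X6)            0.068        1    4.33       0.0759
log(X7)            0.00234      1    0.149      0.711
log(X8)            1.7          1    108        1.65e-05
log(X9)            1.73         1    111        1.53e-05
log(X1):log(X3)    0.0213       1    1.36       0.282
log(X1):log(X8)    0.059        1    3.76       0.0937
log(X1):log(X9)    0.0345       1    2.2        0.182
log(X2):log(X8)    0.118        1    7.52       0.0288
log(X2):log(X9)    0.0677       1    4.32       0.0763
log(X3):log(X9)    0.0209       1    1.33       0.286
log(X4):log(X8)    0.199        1    12.7       0.00915
log(X4):log(X9)    0.086        1    5.48       0.0517
log(X5):log(X8)    0.0322       1    2.05       0.195
log(X5):log(X9)    0.0366       1    2.33       0.171
log(X6):log(X8)    0.0304       1    1.94       0.206
log(X6):log(X9)    0.0571       1    3.64       0.0981
log(X7):log(X9)    0.0239       1    1.52       0.257
Residuals          0.11         7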
# Collinearity Diagnostics #
# ols_vif_tol(model_wf_aic_all_log_inter)

#Model Fit Assessment
# ols_plot_diagnostics(model_wf_aic_all_log_inter)

# Correlation Test for Normality
# ols_test_correlation(model_wf_aic_all_log_inter) # Correlation between observed residuals and expected residuals under normality.

# Residual Normality Test
# ols_test_normality(model_wf_aic_all_log_inter) # Tests for violation of the normality assumption. A p-value above the chosen significance level (e.g. 0.05) gives no evidence against normality.

# Variable Contributions
# ols_plot_added_variable(model_wf_aic_all_log_inter)

# Residual Plus Component Plot
# ols_plot_comp_plus_resid(model_wf_aic_all_log_inter)